VERITAS Cluster Server for UNIX, Implementing Local Clusters HA-VCS-410-101A-2-10-SRT (100-002148)

COURSE DEVELOPERS
Bilge Gerrits
Siobhan Seeger
Dawn Walker

LEAD SUBJECT MATTER EXPERTS

Geoff Bergren
Connie Economou
Paul Johnston
Dave Rogers
Pete Toemmes
Jim Senicka

TECHNICAL CONTRIBUTORS AND REVIEWERS

Billie Bachra
Barbara Ceran
Gene Henriksen
Bob Lucas

Disclaimer

The information contained in this publication is subject to change without notice. VERITAS Software Corporation makes no warranty of any kind with regard to this guide, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. VERITAS Software Corporation shall not be liable for errors contained herein or for incidental or consequential damages in connection with the furnishing, performance, or use of this manual.

Copyright

Copyright © 2005 VERITAS Software Corporation. All rights reserved. No part of the contents of this training material may be reproduced in any form or by any means or be used for the purposes of training or education without the written permission of VERITAS Software Corporation.

Trademark Notice

VERITAS, the VERITAS logo, VERITAS FirstWatch, VERITAS Cluster Server, VERITAS File System, VERITAS Volume Manager, VERITAS NetBackup, and VERITAS HSM are registered trademarks of VERITAS Software Corporation. Other product names mentioned herein may be trademarks and/or registered trademarks of their respective companies.

VERITAS Cluster Server for UNIX, Implementing Local Clusters Participant Guide

April 2005 Release

VERITAS Software Corporation
350 Ellis Street
Mountain View, CA 94043
Phone 650-527-8000
www.veritas.com

Table of Contents

Course Introduction
VERITAS Cluster Server Curriculum ........ Intro-2
Course Prerequisites ........ Intro-3
Course Objectives ........ Intro-4

Lesson 1: Workshop: Reconfiguring Cluster Membership
Introduction ........ 1-2
Workshop Overview ........ 1-4
Task 1: Removing a System from a Running VCS Cluster ........ 1-5
  Objective ........ 1-5
  Assumptions ........ 1-5
  Procedure for Removing a System from a Running VCS Cluster ........ 1-6
  Solution to Class Discussion 1: Removing a System ........ 1-9
  Commands Required to Complete Task 1 ........ 1-11
  Solution to Class Discussion 1: Commands for Removing a System ........ 1-14
  Lab Exercise: Task 1—Removing a System from a Running Cluster ........ 1-18
Task 2: Adding a New System to a Running VCS Cluster ........ 1-19
  Objective ........ 1-19
  Assumptions ........ 1-19
  Procedure to Add a New System to a Running VCS Cluster ........ 1-20
  Solution to Class Discussion 2: Adding a System ........ 1-23
  Commands Required to Complete Task 2 ........ 1-25
  Solution to Class Discussion 2: Commands for Adding a System ........ 1-28
  Lab Exercise: Task 2—Adding a New System to a Running Cluster ........ 1-32
Task 3: Merging Two Running VCS Clusters ........ 1-33
  Objective ........ 1-33
  Assumptions ........ 1-33
  Procedure to Merge Two VCS Clusters ........ 1-34
  Solution to Class Discussion 3: Merging Two Running Clusters ........ 1-37
  Commands Required to Complete Task 3 ........ 1-39
  Solution to Class Discussion 3: Commands to Merge Clusters ........ 1-42
  Lab Exercise: Task 3—Merging Two Running VCS Clusters ........ 1-46
Lab 1: Reconfiguring Cluster Membership ........ 1-48

Lesson 2: Service Group Interactions
Introduction ........ 2-2
Common Application Relationships ........ 2-4
  Online on the Same System ........ 2-4
  Online Anywhere in the Cluster ........ 2-5
  Online on Different Systems ........ 2-6
  Offline on the Same System ........ 2-7
Service Group Dependency Definition ........ 2-8
  Startup Behavior Summary ........ 2-8
  Failover Behavior Summary ........ 2-9

Service Group Dependency Examples ........ 2-10
  Online Local Dependency ........ 2-10
  Online Global Dependency ........ 2-14
  Online Remote Dependency ........ 2-16
  Offline Local Dependency ........ 2-18
Configuring Service Group Dependencies ........ 2-19
  Service Group Dependency Rules ........ 2-19
  Creating Service Group Dependencies ........ 2-20
  Removing Service Group Dependencies ........ 2-20
Alternative Methods of Controlling Interactions ........ 2-21
  Limitations of Service Group Dependencies ........ 2-21
  Using Resources to Control Service Group Interactions ........ 2-22
  Using Triggers to Control Service Group Interactions ........ 2-24
Lab 2: Service Group Dependencies ........ 2-26

Lesson 3: Workload Management
Introduction ........ 3-2
Startup Rules and Policies ........ 3-4
  Rules for Automatic Service Group Startup ........ 3-4
  Automatic Startup Policies ........ 3-5
Failover Rules and Policies ........ 3-10
  Rules for Automatic Service Group Failover ........ 3-10
  Failover Policies ........ 3-11
  Integrating Dynamic Load Calculations ........ 3-15
Controlling Overloaded Systems ........ 3-16
  The LoadWarning Trigger ........ 3-16
  Example Script ........ 3-17
Additional Startup and Failover Controls ........ 3-18
  Limits and Prerequisites ........ 3-18
  Selecting a Target System ........ 3-19
  Combining Capacity and Limits ........ 3-20
Configuring Startup and Failover Policies ........ 3-21
  Setting Load and Capacity ........ 3-21
  Setting Limits and Prerequisites ........ 3-22
Using the Simulator ........ 3-24
  Modeling Workload Management ........ 3-24
Lab 3: Testing Workload Management ........ 3-26

Lesson 4: Alternate Storage and Network Configurations
Introduction ........ 4-2
Alternative Storage and Network Configurations ........ 4-4
  The Disk Resource and Agent on Solaris ........ 4-5
  The DiskReservation Resource and Agent on Solaris ........ 4-5
  The LVMVolumeGroup Agent on AIX ........ 4-6
  LVM Setup on HP-UX ........ 4-7
  The LVMVolumeGroup Resource and Agent on HP-UX ........ 4-8
  LVMLogicalVolume Resource and Agent on HP-UX ........ 4-9

  LVMCombo Resource and Agent on HP-UX ........ 4-9
  The DiskReservation Resource and Agent on Linux ........ 4-10
Alternative Network Configurations ........ 4-11
  Network Resources Overview ........ 4-13
Additional Network Resources ........ 4-14
  The MultiNICA Resource and Agent ........ 4-14
  MultiNICA Resource Configuration ........ 4-17
  MultiNICA Failover ........ 4-20
  The IPMultiNIC Resource and Agent ........ 4-21
  IPMultiNIC Failover ........ 4-25
Additional Network Design Requirements ........ 4-26
  MultiNICB and IPMultiNICB ........ 4-26
  How the MultiNICB Agent Operates ........ 4-27
  The MultiNICB Resource and Agent ........ 4-29
  The IPMultiNICB Resource and Agent ........ 4-36
  Configuring IPMultiNICB ........ 4-37
  The MultiNICB Trigger ........ 4-39
Example MultiNIC Setup ........ 4-40
  Comparing MultiNICA and MultiNICB ........ 4-41
  Testing Local Interface Failover ........ 4-42
Lab 4: Configuring Multiple Network Interfaces ........ 4-44

Lesson 5: Maintaining VCS
Introduction ........ 5-2
Making Changes in a Cluster Environment ........ 5-4
  Replacing a System ........ 5-4
  Preparing for Software and Hardware Upgrades ........ 5-5
  Operating System Upgrade Example ........ 5-6
  Performing a Rolling Upgrade in a Running Cluster ........ 5-7
Upgrading VERITAS Cluster Server ........ 5-8
  Preparing for a VCS Upgrade ........ 5-8
  Upgrading to VCS 4.x from VCS 1.3—3.5 ........ 5-9
  Upgrading from VCS QuickStart to VCS 4.x ........ 5-10
  Other Upgrade Considerations ........ 5-11
Alternative VCS Installation Methods ........ 5-12
  Options to the installvcs Utility ........ 5-12
  Options and Features of the installvcs Utility ........ 5-12
  Manual Installation Procedure ........ 5-14
  Licensing VCS ........ 5-16
  Creating a Single-Node Cluster ........ 5-17
Staying Informed ........ 5-18
  Obtaining Information from VERITAS Support ........ 5-18

Lesson 6: Validating VCS Implementation
Introduction ........ 6-2
VCS Best Practices Review ........ 6-4
  Cluster Interconnect ........ 6-4

  Shared Storage ........ 6-5
  Public Network ........ 6-6
  Failover Configuration ........ 6-7
  External Dependencies ........ 6-8
  Testing ........ 6-9
  Other Considerations ........ 6-10
Solution Acceptance Testing ........ 6-11
  Examples of Solution Acceptance Testing ........ 6-12
Knowledge Transfer ........ 6-13
  System and Network Administration ........ 6-13
  Application Administration ........ 6-14
  The Implementation Report ........ 6-15
High Availability Solutions ........ 6-16
  Local Cluster with Shared Storage ........ 6-16
  Campus or Metropolitan Shared Storage Cluster ........ 6-17
  Replicated Data Cluster (RDC) ........ 6-18
  Wide Area Network (WAN) Cluster for Disaster Recovery ........ 6-19
  High Availability References ........ 6-20
  VERITAS High Availability Curriculum ........ 6-22

Appendix A: Lab Synopses
Lab 1 Synopsis: Reconfiguring Cluster Membership ........ A-2
Lab 2 Synopsis: Service Group Dependencies ........ A-7
Lab 3 Synopsis: Testing Workload Management ........ A-14
Lab 4 Synopsis: Configuring Multiple Network Interfaces ........ A-20

Appendix B: Lab Details
Lab 1 Details: Reconfiguring Cluster Membership ........ B-3
Lab 2 Details: Service Group Dependencies ........ B-17
Lab 3 Details: Testing Workload Management ........ B-29
Lab 4 Details: Configuring Multiple Network Interfaces ........ B-37

Appendix C: Lab Solutions
Lab Solution 1: Reconfiguring Cluster Membership ........ C-3
Lab 2 Solution: Service Group Dependencies ........ C-25
Lab 3 Solution: Testing Workload Management ........ C-45
Lab 4 Solution: Configuring Multiple Network Interfaces ........ C-63

Appendix D: Job Aids
Service Group Dependencies—Definitions ........ D-2
Service Group Dependencies—Failover Process ........ D-6

Appendix E: Design Worksheet: Template

Index

Course Introduction

VERITAS Cluster Server Curriculum
The VERITAS Cluster Server curriculum is a series of courses that are designed to provide a full range of expertise with VERITAS Cluster Server (VCS) high availability solutions—from design through disaster recovery.

VERITAS Cluster Server for UNIX, Fundamentals
This course covers installation and configuration of common VCS configurations, focusing on two-node clusters running application and database services.

VERITAS Cluster Server for UNIX, Implementing Local Clusters
This course focuses on multinode VCS clusters and advanced topics related to more complex cluster configurations.

VERITAS Cluster Server Agent Development
This course enables students to create and customize VCS agents.

High Availability Design Using VERITAS Cluster Server
This course enables participants to translate high availability requirements into a VCS design that can be deployed using VERITAS Cluster Server.

Disaster Recovery Using VVR and Global Cluster Option
This course covers cluster configurations across remote sites, including Replicated Data Clusters (RDCs) and the Global Cluster Option for wide-area clusters.

Learning Path (slide diagram) showing the VERITAS Cluster Server curriculum: VERITAS Cluster Server, Fundamentals; VERITAS Cluster Server, Implementing Local Clusters; High Availability Design Using VERITAS Cluster Server; Disaster Recovery Using VVR and Global Cluster Option; VERITAS Cluster Server Agent Development.

Course Prerequisites
This course assumes that you have a complete understanding of the fundamentals of the VERITAS Cluster Server (VCS) product. You should understand the basic components and functions of VCS before you begin to implement a high availability environment using VCS.

You are also expected to have expertise in system, storage, and network administration of UNIX systems.

Course Prerequisites
To successfully complete this course, you are expected to have:
• The level of experience gained in the VERITAS Cluster Server Fundamentals course:
  – Understanding VCS terms and concepts
  – Using the graphical and command-line interfaces
  – Creating and managing service groups
  – Responding to resource, system, and communication faults
• System, storage, and network administration expertise with one or more UNIX-based operating systems

Course Objectives
In the VERITAS Cluster Server Implementing Local Clusters course, you are given a high availability design to implement in the classroom environment using VERITAS Cluster Server.

The course simulates the job tasks that you perform to configure advanced cluster features. Lessons build upon each other, exhibiting the processes and recommended best practices that you can apply to implementing any cluster design.

The core material focuses on the most common cluster implementations. Other cluster configurations emphasizing additional VCS capabilities are provided to illustrate the power and flexibility of VERITAS Cluster Server.

Course Objectives
After completing the VERITAS Cluster Server Implementing Local Clusters course, you will be able to:
• Reconfigure cluster membership to add and remove systems from a cluster.
• Configure dependencies between service groups.
• Manage workload among cluster systems.
• Implement alternative storage and network configurations.
• Perform common maintenance tasks.
• Validate your cluster implementation.

Lab Design for the Course
The diagram shows a conceptual view of the cluster design used as an example throughout this course and implemented in hands-on lab exercises.

Each aspect of the cluster configuration is described in greater detail where applicable in course lessons.

The cluster consists of:
• Four nodes
• Three to five high availability services, including Oracle
• Fibre connections to SAN shared storage from each node through a switch
• Two Ethernet interfaces for the private cluster heartbeat network
• Ethernet connections to the public network

Additional complexity is added to the design to illustrate certain aspects of cluster configuration in later lessons. The design diagram shows a conceptual view of the cluster design described in the worksheet.

Lab Design for the Course (slide diagram): cluster vcs1 running the service groups name1SG1, name1SG2, name2SG1, name2SG2, and NetworkSG.

Course Overview
This training provides comprehensive instruction on the deployment of advanced features of VERITAS Cluster Server (VCS). The course focuses on multinode VCS clusters and advanced topics related to more complex cluster configurations, such as service group dependencies and workload management.

Course Overview

Lesson 1: Reconfiguring Cluster Membership
Lesson 2: Service Group Interactions
Lesson 3: Workload Management
Lesson 4: Storage and Network Alternatives
Lesson 5: Maintaining VCS
Lesson 6: Validating VCS Implementation

Lesson 1
Workshop: Reconfiguring Cluster Membership

Introduction

Overview
This lesson is a workshop that teaches you to think through the impact of changing the cluster configuration while maximizing application service availability, and to plan accordingly. The workshop also provides a means of reviewing everything you have learned so far about VCS clusters.

Importance
To maintain existing VCS clusters and clustered application services, you may be required to add or remove systems to and from existing VCS clusters, or to merge clusters to consolidate servers. You need to have a very good understanding of how VCS works and how configuration changes affect application service availability before you can plan and execute these changes in a cluster.

Lesson Introduction
Lesson 1: Reconfiguring Cluster Membership
Lesson 2: Service Group Interactions
Lesson 3: Workload Management
Lesson 4: Storage and Network Alternatives
Lesson 5: Maintaining VCS
Lesson 6: Validating VCS Implementation


Outline of Topics
• Task 1: Removing a System
• Task 2: Adding a System
• Task 3: Merging Two Running VCS Clusters

Labs and solutions are located on the following pages.

“Lab 1 Synopsis: Reconfiguring Cluster Membership,” page A-2
“Lab 1 Details: Reconfiguring Cluster Membership,” page B-3
“Lab Solution 1: Reconfiguring Cluster Membership,” page C-3

Lesson Topics and Objectives

After completing this lesson, you will be able to:

Topic                                    Objective
Task 1: Removing a System                Remove a system from a running cluster.
Task 2: Adding a System                  Add a new system to a running VCS cluster.
Task 3: Merging Two Running Clusters     Merge two running VCS clusters.

Workshop Overview
During this workshop, you will change two 2-node VCS clusters into a 4-node VCS cluster with the same application services. The workshop is carried out in three parts:
• Task 1: Removing a system from a running VCS cluster
• Task 2: Adding a new system to a running VCS cluster
• Task 3: Merging two running VCS clusters

Note: During this workshop, students working on two clusters need to team up to carry out the discussions and the lab exercises.

Each task has three parts:
1 Your instructor will first describe the objective and the assumptions related to the task. Then you will be asked as a team to provide a procedure to accomplish the task while maximizing application services availability. You will then review the procedure in the class, discussing the reasons behind each step.
2 After you have identified the best procedure for the task, you will be asked as a team to provide the VCS commands to carry out each step in the procedure. This will again be followed up by a classroom discussion to identify the possible solutions to the problem.
3 After the task is planned in detail, you carry out the task as a team on the lab systems in the classroom.

You need to complete one task before proceeding to the next.

Reconfiguring Cluster Membership (slide diagram): systems A through D in two running clusters; Task 1 removes a system from one running cluster, Task 2 adds it to the other running cluster, and Task 3 merges the two clusters into a single four-node cluster.


Task 1: Removing a System from a Running VCS Cluster

Objective
The objective of this task is to take a system out of a running VCS cluster and to remove the VCS software on the system with minimal or no impact on application services.

Assumptions
Following is a list of assumptions that you need to take into account while planning a procedure for this task:
• The VCS cluster consists of two or more systems, all of which are up and running.
• There are multiple service groups configured in the cluster. All of the service groups are online somewhere in the cluster. Note that there may also be service groups online on the system that is to be removed from the cluster.
• The application services that are online on the system to be removed from the cluster can be switched over to other systems in the cluster.
  – Although there are multiple service groups in the cluster, this assumption implies that there are no dependencies that need to be taken into account.
  – There are also no service groups that are configured to run only on the system to be removed from the cluster.
• All the VCS software should be removed from the system because it is no longer part of a cluster. However, there is no need to remove any application software from the system.

Task 1: Removing a System from a Running VCS Cluster
Objective: To remove a system from a running VCS cluster while minimizing application and VCS downtime
Assumptions:
– The cluster has two or more systems.
– There are multiple service groups, some of which may be running on the system to be removed.
– All application services should be kept under the cluster control.
– There is nothing to restrict switching over application services to the remaining systems in the cluster.
– VCS software should be removed from the system taken out of the cluster.

Procedure for Removing a System from a Running VCS Cluster
Discuss with your class or team the steps required to carry out Task 1. For each step, decide how the application services availability would be impacted. Note that there may not be a single answer to this question. Therefore, state your reasons for choosing a step in a specific order using the Notes area of your worksheet. Also, in the Notes area, document any assumptions that you are making that have not been explained as part of the task.

Use the worksheet on the following page to provide the steps required for Task 1.

Classroom Discussion for Task 1
Your instructor either groups students into teams or leads a class discussion for this task.
For team-based exercises:
• Each group of four students, working on two clusters, forms a team to discuss the steps required to carry out Task 1 as outlined on the previous slide.
• After all the teams are ready with their proposed procedures, have a classroom discussion to identify the best way of removing a system from a running VCS cluster, providing the reasons for each step.
Note: At this point, you do not need to provide the commands to carry out each step.


Procedure for Task 1 proposed by your team or class:

Steps    Description    Impact on application availability

Notes

Use the following worksheet to document the procedure agreed upon in the classroom.

Final procedure for Task 1 agreed upon as a result of classroom discussions:

Steps    Description    Impact on application availability

Notes


Solution to Class Discussion 1: Removing a System
1 Open the configuration and prevent application failover to the system to be removed.
2 Switch any application services that are running on the system to be removed to any other system in the cluster.
  Note: This step can be combined with step 1 as an option to a single command line.
3 Close the configuration and stop VCS on the system to be removed.
4 Remove any disk heartbeat configuration on the system to be removed.
  Notes:
  – You need to remove both the GAB disk heartbeats and service group heartbeats.
  – After you remove the GAB disk heartbeats, you may also remove the corresponding lines in the /etc/gabtab file that start the disk heartbeats so that the disk heartbeats are not started again in case the system crashes and is rebooted before you remove the VCS software.
5 Stop VCS communication modules (GAB, LLT) and I/O fencing on the system to be removed.
  Note: On the Solaris platform, you also need to unload the kernel modules.
6 Physically remove cluster interconnect links from the system to be removed.
7 Remove VCS software from the system taken out of the cluster.
  Notes:
  – You can either use the uninstallvcs script to automate the removal of the VCS software or use the command specific to the operating platform, such as pkgrm for Solaris, swremove for HP-UX, installp -u for AIX, or rpm -e for Linux, to remove the VCS software packages individually.
  – If you have remote shell access (rsh or ssh) for root between the cluster systems, you can run uninstallvcs on any system in the cluster. Otherwise, you have to run the script on the system to be removed.
  – You may need to manually remove configuration files and VCS directories that include customized scripts.
8 Update service group and resource configurations that refer to the system that is removed.
  Note: Service group attributes, such as AutoStartList, SystemList, SystemZones, and localized resource attributes, such as Device for NIC or IP resource types, may need to be modified.
9 Remove the system from the cluster configuration.

10 Modify the VCS communication configuration files on the remaining systems in the cluster to reflect the change.
  Note: You do not need to stop and restart LLT and GAB on the remaining systems when you make changes to the configuration files unless the /etc/llttab file contains the following directives that need to be changed:
  – include system_id_range
  – exclude system_id_range
  – set-addr systemid tag address
  For more information on these directives, check the VCS manual pages on llttab.


Commands Required to Complete Task 1
After you have agreed on the steps required to accomplish Task 1, determine which VCS commands are used to carry out each step in the procedure. You will first work as a team to propose a solution, and then discuss each step in the classroom. Note that there may be multiple methods to carry out each step.

You can use the Participant Guide, VCS manual pages, the VERITAS Cluster Server User’s Guide, and the VERITAS Cluster Server Installation Guide as sources of information. If there are topics that you do not feel comfortable with, ask your instructor to discuss them in detail during the classroom discussion.

Use the worksheet on the following page to provide the commands required for Task 1.

VCS Commands Required for Task 1
Provide the commands to carry out each step in the recommended procedure for removing a system from a running VCS cluster.
• You may need to refer to previous lessons, VCS manuals, or manual pages to decide on the specific commands and their options.
• For each step, complete the worksheet provided in the Participant Guide and include the command, the system to run it on, and any specific notes.
Note: When you are ready, your instructor will discuss each step in detail.

Commands for Task 1 proposed by your team:

Order of Execution    VCS Command to Use    System on which to run the command    Notes


Use the following worksheet to document any differences to your proposal.

Commands for Task 1 agreed upon in the classroom:

Order of Execution    VCS Command to Use    System on which to run the command    Notes

Solution to Class Discussion 1: Commands for Removing a System
1 Open the configuration and prevent application failover to the system to be removed, persisting through VCS restarts.

haconf -makerw
hasys -freeze -persistent -evacuate train2

2 Switch any application services that are running on the system to be removed to any other system in the cluster.
Note: You can combine this step with step 1 as an option to a single command line.

This step has been combined with step 1.

3 Close the configuration and stop VCS on the system to be removed.

haconf -dump -makero
hastop -sys train2

Note: You can accomplish steps 1-3 using the following commands:

haconf -makerw
hasys -freeze train2
haconf -dump -makero
hastop -sys train2 -evacuate

4 Remove any disk heartbeat configuration on the system to be removed.
Notes:
– Remove both the GAB disk heartbeats and service group heartbeats.
– After you remove the GAB disk heartbeats, also remove the corresponding lines in the /etc/gabtab file that start the disk heartbeats so that the disk heartbeats are not started again in case the system crashes and is rebooted before you remove the VCS software.

gabdiskhb -l
gabdiskhb -d devicename -s start
gabdiskx -l
gabdiskx -d devicename -s start

Also, remove the lines starting with gabdiskhb -a in the /etc/gabtab file.
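To make the gabtab edit concrete, the following is a minimal sketch of the change on the system to be removed; the gabdiskhb -a line reuses the devicename and start placeholders from the commands above and is not a value from this lab environment:

/etc/gabtab before removing the disk heartbeat (example only):
/sbin/gabdiskhb -a devicename -s start
/sbin/gabconfig -c -n 2

/etc/gabtab after removing the disk heartbeat line:
/sbin/gabconfig -c -n 2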


5 Stop VCS communication modules (GAB, LLT) and fencing on the system to be removed.
Note: On the Solaris platform, unload the kernel modules.

On the system to be removed, train2 in this example:

/etc/init.d/vxfen stop (if fencing is configured)
gabconfig -U
lltconfig -U

Solaris Only
modinfo | grep gab
modunload -i gab_id
modinfo | grep llt
modunload -i llt_id
modinfo | grep vxfen
modunload -i fen_id

6 Physically remove cluster interconnect links from the system to be removed.

7 Remove VCS software from the system taken out of the cluster. For purposes of this lab, you do not need to remove the software because this system is put back in the cluster later.
Notes:
– You can either use the uninstallvcs script to automate the removal of the VCS software or use the command specific to the operating platform, such as pkgrm for Solaris, swremove for HP-UX, installp -u for AIX, or rpm -e for Linux, to remove the VCS software packages individually.
– If you have remote shell access (rsh or ssh) for root between the cluster systems, you can run uninstallvcs on any system in the cluster. Otherwise, you have to run the script on the system to be removed.
– You may need to manually remove configuration files and VCS directories that include customized scripts.

WARNING: When using the uninstallvcs script, you are prompted to remove software from all cluster systems. Do not accept the default of Y, or you will inadvertently remove VCS from all cluster systems.

cd /opt/VRTSvcs/install
./uninstallvcs

After the script completes, remove any remaining files related to VCS on train2:

rm /etc/vxfendg
rm /etc/vxfentab
rm /etc/llttab
rm /etc/llthosts
rm /etc/gabtab
rm -r /opt/VRTSvcs
rm -r /etc/VRTSvcs
...

8 Update service group and resource configurations that refer to the system that is removed.
Note: Service group attributes, such as AutoStartList, SystemList, SystemZones, and localized resource attributes, such as Device for NIC or IP resource types, may need to be modified.

On the system remaining in the cluster, train1 in this example:

haconf -makerw

For all service groups that have train2 in their AutoStartList or SystemList:

hagrp -modify groupname AutoStartList -delete train2
hagrp -modify groupname SystemList -delete train2

9 Remove the system from the cluster configuration.

hasys -delete train2

When you have completed the modifications:

haconf -dump -makero
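As an optional sanity check that is not part of the printed procedure (both are standard VCS status commands), you can confirm that the removed system no longer appears in the configuration:

hasys -list (train2 should no longer be listed)
hastatus -sum (the summary should show only the remaining systems)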

10 Modify the VCS communication configuration files on the remaining systems in the cluster to reflect the change.
– Edit /etc/llthosts on all the systems remaining in the cluster (train1 in this example) to remove the line corresponding to the removed system (train2 in this example).
– Edit /etc/gabtab on all the systems remaining in the cluster (train1 in this example) to reduce the -n option to gabconfig by 1.
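As an illustration of these two edits on train1, using the file formats shown later in this lesson (the node IDs shown are examples only):

/etc/llthosts before:
0 train1
1 train2

/etc/llthosts after:
0 train1

/etc/gabtab before:
/sbin/gabconfig -c -n 2

/etc/gabtab after:
/sbin/gabconfig -c -n 1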


Note: You do not need to stop and restart LLT and GAB on the remaining systems when you make changes to the configuration files unless the /etc/llttab file contains the following directives that need to be changed:
– include system_id_range
– exclude system_id_range
– set-addr systemid tag address
For more information on these directives, check the VCS manual pages on llttab.

Lab Exercise: Task 1—Removing a System from a Running Cluster
Complete this exercise now, or at the end of the lesson, as directed by your instructor. One person from each team carries out the commands discussed in the classroom to accomplish Task 1.

For detailed lab steps and solutions for the classroom lab environment, see the following sections of Appendix A, B, or C.
“Task 1: Removing a System from a Running VCS Cluster,” page A-3
“Task 1: Removing a System from a Running VCS Cluster,” page B-6
“Task 1: Removing a System from a Running VCS Cluster,” page C-6

At the end of this lab exercise, you should end up with:
• One system without any VCS software on it
  Note: For purposes of the lab exercises, do not remove the VCS software.
• A one-node cluster that is up and running with three service groups online
• A two-node cluster that is up and running with three service groups online
  This cluster should not be affected while performing Task 1 on the other cluster.

Lab Exercise: Task 1—Removing a System from a Running Cluster
Complete this exercise now or at the end of the lesson, as directed by your instructor.
• One person from each team executes the commands discussed in the classroom to accomplish Task 1.
• See Appendix A, B, or C for detailed steps and classroom-specific information.
Use the lab appendix best suited to your experience level:
– Appendix A: Lab Synopses
– Appendix B: Lab Details
– Appendix C: Lab Solutions


Task 2: Adding a New System to a Running VCS Cluster

Objective
The objective of this task is to add a new system to a running VCS cluster with no or minimal impact on application services. Ensure that the cluster configuration is modified so that the application services can make use of the new system in the cluster.

Assumptions
Take these assumptions into account while planning a procedure for this task:
• The VCS cluster consists of two or more systems, all of which are up and running.
• There are multiple service groups configured in the cluster. All of the service groups are online somewhere in the cluster.
• The new system to be added to the cluster does not have any VCS software.
• The new system has the same version of operating system and VERITAS Storage Foundation as the systems in the cluster.
• The new system may not have all the required application software.
• The storage devices can be connected to all systems.

Task 2: Adding a New System to a Running VCS Cluster
Objective: Add a new system to a running VCS cluster while keeping the application services and VCS available and enabling the new system to run all of the application services.
Assumptions:
– The cluster has two or more systems.
– The new system does not have any VCS software.
– The storage devices can be connected to all systems.

Procedure to Add a New System to a Running VCS Cluster
Discuss with your team or class the steps required to carry out Task 2. For each step, decide how the application services availability would be impacted. Note that there may not be a single answer to this question. Therefore, state your reasons for choosing a step in a specific order using the Notes area of your worksheet. Also, in the Notes area, document any assumptions that you are making that have not been explained as part of the task.

Use the worksheet on the following page to provide the steps required for Task 2.

Classroom Discussion for Task 2
Your instructor either groups students into teams or leads a class discussion for this task.
For team-based exercises:
• Each group of four students, working on two clusters, forms a team to discuss the steps required to carry out Task 2 as outlined on the previous slide.
• After all the teams are ready with their proposed procedures, have a classroom discussion to identify the best way of adding a new system to a running VCS cluster, providing the reasons for each step.
Note: At this point, you do not need to provide the commands to carry out each step.


Procedure for Task 2 proposed by your team:

Steps    Description    Impact on application availability

Notes

Use the following worksheet to document the procedure agreed upon by the class.

Final procedure for Task 2 agreed upon as a result of classroom discussions:

Steps    Description    Impact on application availability

Notes


Solution to Class Discussion 2: Adding a System
1 Install any necessary application software on the new system.
2 Configure any application resources necessary to support clustered applications on the new system.
  Note: The new system should be capable of running the application services in the cluster it is about to join. Preparing application resources may include:
  – Creating user accounts
  – Copying application configuration files
  – Creating mount points
  – Verifying shared storage access
  – Checking NFS major and minor numbers
3 Physically cable cluster interconnect links.
  Note: If the original cluster is a two-node cluster with crossover cables for the cluster interconnect, change to hubs or switches before you can add another node. Ensure that the cluster interconnect is not completely disconnected while you are carrying out the changes.
4 Install VCS.
  Notes:
  – You can either use the installvcs script with the -installonly option to automate the installation of the VCS software or use the command specific to the operating platform, such as pkgadd for Solaris, swinstall for HP-UX, installp -a for AIX, or rpm for Linux, to install the VCS software packages individually.
  – If you are installing packages manually:
    › Follow the package dependencies. For the correct order, refer to the VERITAS Cluster Server Installation Guide.
    › After the packages are installed, license VCS on the new system using the /opt/VRTSvcs/install/licensevcs command.
  a Start the installation.
  b Specify the name of the new system to the script (train2 in this example).
  c After the script has completed, create the communication configuration files on the new system.
5 Configure VCS communication modules (GAB, LLT) on the new system.
6 Configure fencing on the new system, if used in the cluster.

7 Update VCS communication configuration (GAB, LLT) on the existing systems.
  Note: You do not need to stop and restart LLT and GAB on the existing systems in the cluster when you make changes to the configuration files unless the /etc/llttab file contains the following directives that need to be changed:
  – include system_id_range
  – exclude system_id_range
  – set-addr systemid tag address
  For more information on these directives, check the VCS manual pages on llttab.

8 Install any VCS Enterprise agents required on the new system.
9 Copy any triggers, custom agents, scripts, and so on from existing cluster systems to the new cluster system.
10 Start cluster services on the new system and verify cluster membership.
11 Update service group and resource configuration to use the new system.
  Note: Service group attributes, such as SystemList, AutoStartList, SystemZones, and localized resource attributes, such as Device for NIC or IP resource types, may need to be modified.
12 Verify updates to the configuration by switching the application services to the new system.
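The commands for these steps are worked out in the next section; as a quick sketch of the final verification in step 12, the standard VCS group-switch command can be used (the group name here is a placeholder):

hagrp -switch groupname -to train2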


Commands Required to Complete Task 2
After you have agreed on the steps required to accomplish Task 2, you need to determine which VCS commands are required to perform each step in the procedure. You will first work as a team to propose a solution, and then discuss each step in the classroom. Note that there may be multiple methods to carry out each step.

You can use the Participant Guide, VCS manual pages, the VERITAS Cluster Server User’s Guide, and the VERITAS Cluster Server Installation Guide as sources of information. If there are topics that you do not understand well, ask your instructor to discuss them in detail during the classroom discussion.

Use the worksheet on the following page to provide the commands required for Task 2.

VCS Commands Required for Task 2
Provide the commands to perform each step in the recommended procedure for adding a system to a running VCS cluster.
• You may need to refer to previous lessons, VCS manuals, or manual pages to decide on the specific commands and their options.
• For each step, complete the worksheet provided in the Participant Guide by providing the command, the system to run it on, and any specific notes.
Note: When you are ready, your instructor will discuss each step in detail.

Commands for Task 2 proposed by your team:

Order of Execution    VCS Command to Use    System on which to run the command    Notes


Use the following worksheet to document any differences to your proposal.

Commands for Task 2 agreed upon in the classroom:

Order of Execution    VCS Command to Use    System on which to run the command    Notes

Solution to Class Discussion 2: Commands for Adding a System
1 Install any necessary application software on the new system.
2 Configure any application resources necessary to support clustered applications on the new system.
  Note: The new system should be capable of running the application services in the cluster it is about to join. Preparing application resources may include:
  – Creating user accounts
  – Copying application configuration files
  – Creating mount points
  – Verifying shared storage access
  – Checking NFS major and minor numbers
3 Physically cable cluster interconnect links.
  Note: If the original cluster is a two-node cluster with crossover cables for the cluster interconnect, change to hubs or switches before you can add another node. Ensure that the cluster interconnect is not completely disconnected while you are carrying out the changes.
4 Install VCS and configure VCS communication modules (GAB, LLT) on the new system. If you skipped the removal step in the previous section, you do not need to install VCS on this system.
  Notes:
  – You can either use the installvcs script with the -installonly option to automate the installation of the VCS software or use the command specific to the operating platform, such as pkgadd for Solaris, swinstall for HP-UX, installp -a for AIX, or rpm for Linux, to install the VCS software packages individually.
  – If you are installing packages manually:
    › Follow the package dependencies. For the correct order, refer to the VERITAS Cluster Server Installation Guide.
    › After the packages are installed, license VCS on the new system using the /opt/VRTSvcs/install/licensevcs command.
  a Start the installation.

cd /install_location
./installvcs -installonly

  b Specify the name of the new system to the script (train2 in this example).


5 After the script completes, create the communication configuration files on the new system.

› /etc/llttab
This file should have the same cluster ID as the other systems in the cluster. This is the /etc/llttab file used in this example configuration:
set-cluster 2
set-node train2
link tag1 /dev/interface1:x - ether - -
link tag2 /dev/interface2:x - ether - -
link-lowpri tag3 /dev/interface3:x - ether - -

› /etc/llthosts
This file should contain a unique node number for each system in the cluster, and it should be the same on all systems in the cluster. This is the /etc/llthosts file used in this example configuration:
0 train3
1 train4
2 train2

› /etc/gabtab
This file should contain the command to start GAB and any configured disk heartbeats. This is the /etc/gabtab file used in this example configuration:
/sbin/gabconfig -c -n 3
Note: The seed number used after the -n option should be equal to the total number of systems in the cluster.

6 Configure fencing on the new system, if fencing is used in the cluster.
Create /etc/vxfendg and enter the coordinator disk group name.
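As a minimal sketch (the coordinator disk group name vxfencoorddg is an assumption used for illustration; use the name already configured on the existing cluster systems), the file can be created with a single command. The fencing driver itself is started later, in step 10.
On train2:
echo "vxfencoorddg" > /etc/vxfendg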

7 Update the VCS communication configuration (GAB, LLT) on the existing systems.
Note: You do not need to stop and restart LLT and GAB on the existing systems in the cluster when you make changes to the configuration files unless the /etc/llttab file contains the following directives that need to be changed:
– include system_id_range
– exclude system_id_range
– set-addr systemid tag address
For more information on these directives, check the VCS manual pages on llttab.


a Edit /etc/llthosts on all the systems in the cluster (train3 and train4 in this example) to add an entry corresponding to the new system (train2 in this example).
On train3 and train4:
# vi /etc/llthosts
0 train3
1 train4
2 train2

b Edit /etc/gabtab on all the systems in the cluster (train3 and train4 in this example) to increase the -n option to gabconfig by 1.
On train3 and train4:
# vi /etc/gabtab
/sbin/gabconfig -c -n 3

8 Install any VCS Enterprise agents required on the new system. This example shows installing the Enterprise agent for Oracle.
On train2:
cd /install_dir
Solaris:  pkgadd -d /install_dir VRTSvcsor
AIX:      installp -ac -d /install_dir/VRTSvcsor.rte.bff VRTSvcsor.rte
HP-UX:    swinstall -s /install_dir/pkgs VRTSvcsor
Linux:    rpm -ihv VRTSvcsor-2.0-Linux.i386.rpm

9 Copy any triggers, custom agents, scripts, and so on from the existing cluster systems to the new cluster system. Because this is a new system to be added to the cluster, you need to copy these trigger scripts to the new system.
On the new system (train2 in this example):
cd /opt/VRTSvcs/bin/triggers
rcp train3:/opt/VRTSvcs/bin/triggers/* .


10 Start cluster services on the new system and verify cluster membership.
On train2:
lltconfig -c
gabconfig -c -n 3
gabconfig -a
The port a membership should include the node ID for train2.
/etc/init.d/vxfen start
hastart
gabconfig -a
Both the port a and port h memberships should include the node ID for train2.
Note: You can also use the LLT, GAB, and VCS startup files installed by the VCS packages to start cluster services.

11 Update the service group and resource configuration to use the new system.
Note: Service group attributes, such as SystemList, AutoStartList, and SystemZones, and localized resource attributes, such as Device for the NIC or IP resource types, may need to be modified.
haconf -makerw
For all service groups in the vcs2 cluster, modify the SystemList and AutoStartList attributes:
hagrp -modify groupname SystemList -add train2 priority
hagrp -modify groupname AutoStartList -add train2
When you have completed the modifications:
haconf -dump -makero

12 Verify the updates to the configuration by switching the application services to the new system.
For all service groups in the vcs2 cluster:
hagrp -switch groupname -to train2
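To confirm the result of each switch, you can check the cluster and group states; a quick sketch (groupname is a placeholder):
hastatus -summary
hagrp -state groupname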


Lab Exercise: Task 2—Adding a New System to a Running Cluster
Before starting the discussion about Task 3, one person from each team executes the commands discussed in the classroom to accomplish Task 2.

For detailed lab steps and solutions for the classroom lab environment, see the following sections of Appendix A, B, or C:
• “Task 2: Adding a System to a Running VCS Cluster,” page A-4
• “Task 2: Adding a System to a Running VCS Cluster,” page B-9
• “Task 2: Adding a System to a Running VCS Cluster,” page C-10

At the end of this lab exercise, you should have:
• A one-node cluster that is up and running with three service groups online
  There should be no changes in this cluster after Task 2.
• A three-node cluster that is up and running with three service groups online
  All the systems should be capable of running all the service groups after Task 2.

Complete this exercise now or at the end of the lesson, as directed by your instructor.


Task 3: Merging Two Running VCS Clusters
Objective
The objective of this task is to merge two running VCS clusters with no or minimal impact on application services. Also, ensure that the cluster configuration is modified so that the application services can make use of the systems from both clusters.

Assumptions
Following is a list of assumptions that you need to take into account while planning a procedure for this task:
• All the systems in both clusters are up and running.
• There are multiple service groups configured in both clusters. All of the service groups are online somewhere in the cluster.
• All the systems have the same version of operating system and VERITAS Storage Foundation.
• The clusters do not necessarily have the same application services software.
• New application software can be installed on the systems to support application services of the other cluster.
• The storage devices can be connected to all systems.
• The cluster interconnects of both clusters are isolated before the merge.

For this example, you can assume that a one-node cluster is merged with a three-node cluster as in this lab environment.



Procedure to Merge Two VCS Clusters
Discuss with your team the steps required to carry out Task 3. For each step, decide how the application services availability would be impacted. Note that there may not be a single answer to this question. Therefore, state your reasons for choosing a step in a specific order using the Notes area of your worksheet. Also, in the Notes area, document any assumptions that you are making that have not been explained as part of the task.

Use the worksheet on the following page to provide the steps required for Task 3.

Classroom Discussion for Task 3

Note: At this point, you do not need to provide the commands to carry out each step.

Your instructor either groups students into teams or leads a class discussion for this task.
For team-based exercises:
Each group of four students, working on two clusters, forms a team to discuss the steps required to carry out Task 3 as outlined on the previous slide. After all the teams are ready with their proposed procedures, have a classroom discussion to identify the best way of merging two running VCS clusters, providing the reasons for each step.


Procedure for Task 3 proposed by your team:

Worksheet columns: Steps | Description | Impact on application availability | Notes


Use the following worksheet to document the procedure agreed upon by the class.

Final procedure for Task 3 agreed upon as a result of classroom discussions:

Worksheet columns: Steps | Description | Impact on application availability | Notes


Solution to Class Discussion 3: Merging Two Running Clusters
In the following steps, it is assumed that the smaller (first) cluster is merged into the larger (second) cluster. That is, the merged cluster keeps the name and ID of the second cluster, and the second cluster is not brought down during the whole process.
1 Modify the VCS communication files on the second cluster to recognize the systems to be added from the first cluster.
Note: You do not need to stop and restart LLT and GAB on the existing systems in the second cluster when you make changes to the configuration files unless the /etc/llttab file contains the following directives that need to be changed:
– include system_id_range
– exclude system_id_range
– set-addr systemid tag address
For more information on these directives, check the VCS manual pages on llttab.

2 Add the names of the systems in the first cluster to the second cluster.
3 Install and configure any additional application software required to support the merged configuration on all systems.
Notes:
– Installing applications in a VCS cluster would require freezing systems. This step may also involve switching application services and rebooting systems, depending on the application installed.
– All the systems should be capable of running the application services when the clusters are merged. Preparing application resources may include:
  › Creating user accounts
  › Copying application configuration files
  › Creating mount points
  › Verifying shared storage access

4 Install any additional VCS Enterprise agents on each system.
Note: Enterprise agents should only be installed, not configured.

5 Copy any additional custom agents to all systems.
Note: Custom agents should only be installed, not configured.

6 Extract service group configuration from the small cluster, so you can add it to the larger cluster configuration without stopping VCS.

7 Copy or merge any existing trigger scripts on all systems.
Notes:
– The extent of this step depends on the contents of the trigger scripts. Because the trigger scripts are in use on the existing cluster systems, it is recommended to merge the scripts in a temporary directory.
– Depending on the changes required, it may be necessary to stop cluster services on the systems before copying the merged trigger scripts.


8 Stop cluster services (VCS, fencing, GAB, and LLT) on the systems in the first cluster.
Note: Leave the application services running on the systems.

9 Reconfigure VCS communication modules on the systems in the first cluster and physically connect cluster interconnects.

10 Start cluster services (LLT, GAB, fencing, and VCS) on the systems in the first cluster and verify cluster memberships.

11 Update the service group and resource configuration to use all the systems.
Note: Service group attributes, such as AutoStartList, SystemList, and SystemZones, and localized resource attributes, such as Device for the NIC or IP resource types, may need to be modified.

12 Verify updates to the configuration by switching application services between the systems in the merged cluster.


Commands Required to Complete Task 3
After you have agreed on the steps required to accomplish Task 3, determine the VCS commands required to perform each step in the procedure. You will first work as a team to propose a solution, and then discuss each step in the classroom. Note that there may be multiple methods to carry out each step.

You can use the participant guide, the VCS manual pages, the VERITAS Cluster Server User’s Guide, and the VERITAS Cluster Server Installation Guide as sources of information. If there are topics that you do not understand, ask your instructor to discuss them in detail during the classroom discussion.

Use the worksheet on the following page to provide the commands required for Task 3.

VCS Commands Required for Task 3
Provide the commands to perform each step in the recommended procedure for merging two VCS clusters.

You may need to refer to previous lessons, VCS manuals, or manual pages to decide on the specific commands and their options. For each step, complete the worksheet provided in the participant guide, providing the command, the system on which to run it, and any specific notes.

Note: When you are ready, your instructor will discuss each step in detail.


Commands for Task 3 proposed by your team:

Worksheet columns: Order of Execution | VCS Command to Use | System on which to run the command | Notes


Use the following worksheet to document any differences from your proposal.

Commands for Task 3 agreed upon in the classroom:

Worksheet columns: Order of Execution | VCS Command to Use | System on which to run the command | Notes


Solution to Class Discussion 3: Commands to Merge Clusters
In the following steps, it is assumed that the first cluster is merged into the second; that is, the merged cluster keeps the name and ID of the second cluster, and the second cluster is not brought down during the whole process.
1 Modify the VCS communication files on the second cluster to recognize the systems to be added from the first cluster.
Note: You do not need to stop and restart LLT and GAB on the existing systems in the second cluster when you make changes to the configuration files unless the /etc/llttab file contains the following directives that need to be changed:
– include system_id_range
– exclude system_id_range
– set-addr systemid tag address
For more information on these directives, check the VCS manual pages on llttab.
– Edit /etc/llthosts on all the systems in the second cluster to add entries corresponding to the new systems from the first cluster.
On train2, train3, and train4:
vi /etc/llthosts
0 train3
1 train4
2 train2
3 train1
– Edit /etc/gabtab on all the systems in the second cluster to increase the -n option to gabconfig by the number of systems in the first cluster.
On train2, train3, and train4:
vi /etc/gabtab
/sbin/gabconfig -c -n 4

2 Add the names of the systems in the first cluster to the second cluster (train1 in this example; train2 was already added in Task 2).
haconf -makerw
hasys -add train1
haconf -dump -makero


3 Install and configure any additional application software required to support the merged configuration on all systems.
Notes:
– Installing applications in a VCS cluster would require freezing systems. This step may also involve switching application services and rebooting systems, depending on the application installed.
– All the systems should be capable of running the application services when the clusters are merged. Preparing application resources may include:
  › Creating user accounts
  › Copying application configuration files
  › Creating mount points
  › Verifying shared storage access

4 Install any additional VCS Enterprise agents on each system.
Note: Enterprise agents should only be installed, not configured.

5 Copy any additional custom agents to all systems.
Note: Custom agents should only be installed, not configured.

6 Extract the service group configuration from the first cluster and add it to the second cluster configuration.
a On the first cluster (vcs1 in this example), create a main.cmd file.
hacf -cftocmd /etc/VRTSvcs/conf/config
b Edit the main.cmd file and filter the commands related to the service group configuration. Note that you do not need the commands related to the ClusterService and NetworkSG service groups because these already exist in the second cluster.
c Copy the filtered main.cmd file to a running system in the second cluster, for example, to train3.
d On the system in the second cluster where you copied the main.cmd file (train3 in vcs2 in this example), open the configuration.
haconf -makerw
e Execute the filtered main.cmd file.
sh main.cmd

Note: Any customized resource type attributes in the first cluster are not included in this procedure and may require special consideration before adding them to the second cluster configuration.
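For illustration only, a filtered main.cmd fragment for a hypothetical failover service group named appSG might contain command sequences similar to the following (group, resource, and attribute values are placeholders, not the lab configuration):
hagrp -add appSG
hagrp -modify appSG SystemList train1 0
hagrp -modify appSG AutoStartList train1
hares -add appDG DiskGroup appSG
hares -modify appDG DiskGroup appdg
hares -add appMount Mount appSG
hares -modify appMount MountPoint "/app"
hares -link appMount appDG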

7 Copy or merge any existing trigger scripts on all systems.
Notes:
– The extent of this step depends on the contents of the trigger scripts. Because the trigger scripts are in use on the existing cluster systems, it is recommended to merge the scripts in a temporary directory.
– Depending on the changes required, it may be necessary to stop cluster services on the systems before copying the merged trigger scripts.


8 Stop cluster services (VCS, fencing, GAB, and LLT) on the systems in the first cluster.
Note: Leave the application services running on the systems.
a On one system in the first cluster (train1 in vcs1 in this example), stop VCS.
hastop -all -force
b On all the systems in the first cluster (train1 in vcs1 in this example), stop fencing, and then stop GAB and LLT.
/etc/init.d/vxfen stop
gabconfig -U
lltconfig -U

9 Reconfigure the VCS communication modules on the systems in the first cluster and physically connect the cluster interconnects.
On all the systems in the first cluster (train1 in vcs1 in this example):
a Edit /etc/llttab and modify the cluster ID to be the same as the second cluster.
# vi /etc/llttab
set-cluster 2
set-node train1
link interface1 /dev/interface1:0 - ether - -
link interface2 /dev/interface2:0 - ether - -
link-lowpri interface3 /dev/interface3:0 - ether - -
b Edit /etc/llthosts and ensure that there is a unique entry for all systems in the combined cluster.
# vi /etc/llthosts
0 train3
1 train4
2 train2
3 train1
c Edit /etc/gabtab and modify the -n option to gabconfig to reflect the total number of systems in the combined cluster.
vi /etc/gabtab
/sbin/gabconfig -c -n 4


10 Start cluster services (LLT, GAB, fencing, and VCS) on the systems in the first cluster and verify the cluster memberships.
On train1:
lltconfig -c
gabconfig -c -n 4
gabconfig -a
The port a membership should include the node ID for train1, in addition to the node IDs for train2, train3, and train4.
/etc/init.d/vxfen start
hastart
gabconfig -a
Both the port a and port h memberships should include the node ID for train1, in addition to the node IDs for train2, train3, and train4.
Note: You can also use the LLT, GAB, and VCS startup files installed by the VCS packages to start cluster services.

11 Update the service group and resource configuration to use all the systems.
Note: Service group attributes, such as AutoStartList, SystemList, and SystemZones, and localized resource attributes, such as Device for the NIC or IP resource types, may need to be modified.
a Open the cluster configuration.
haconf -makerw
b For the service groups copied from the first cluster, add train2, train3, and train4 to the SystemList and AutoStartList attributes:
hagrp -modify groupname SystemList -add train2 \
priority2 train3 priority3 train4 priority4
hagrp -modify groupname AutoStartList -add train2 \
train3 train4
c For the service groups that existed in the second cluster before the merge, add train1 to the SystemList and AutoStartList attributes:
hagrp -modify groupname SystemList -add train1 \
priority1
hagrp -modify groupname AutoStartList -add train1
d Close and save the cluster configuration.
haconf -dump -makero

12 Verify the updates to the configuration by switching application services between the systems in the merged cluster.
For all the systems and service groups in the merged cluster, verify operation:
hagrp -switch groupname -to systemname


Lab Exercise: Task 3—Merging Two Running VCS Clusters
To complete the workshop, one person from each team executes the commands discussed in the classroom to accomplish Task 3.

For detailed lab steps and solutions for the classroom lab environment, see the following sections of Appendix A, B, or C:
• “Task 3: Merging Two Running VCS Clusters,” page A-5
• “Task 3: Merging Two Running VCS Clusters,” page B-13
• “Task 3: Merging Two Running VCS Clusters,” page C-16

At the end of this lab exercise, you should have a four-node cluster that is up and running with six application service groups online. All the systems should be capable of running all the application services after Task 3 is completed.

Complete this exercise now or at the end of the lesson, as directed by your instructor.


Summary
This workshop introduced procedures to add systems to and remove systems from a running VCS cluster and to merge two VCS clusters. In doing so, this workshop reviewed the concepts related to how VCS operates, how the VCS communication configuration is changed, and how the cluster configuration impacts the availability of application services.

Next Steps
The next lesson describes how the relationships between application services can be controlled under VCS in a multinode environment running multiple application services. This lesson also shows the impact of these controls during service group failovers.

Additional Resources
• VERITAS Cluster Server Installation Guide
  This guide provides information on how to install VERITAS Cluster Server (VCS) on the specified platform.
• VERITAS Cluster Server User’s Guide
  This document provides information about all aspects of VCS configuration.

Lesson Summary
Key Points
– You can minimize downtime when reconfiguring cluster members.
– Use the procedures in this lesson as guidelines for adding or removing cluster systems.
Reference Materials
– VERITAS Cluster Server Installation Guide
– VERITAS Cluster Server User's Guide


Lab 1: Reconfiguring Cluster Membership
Your instructor may choose to have you complete the exercises as a single lab.

Labs and solutions for this lesson are located on the following pages.

Appendix A provides brief lab instructions for experienced students.
• “Lab 1 Synopsis: Reconfiguring Cluster Membership,” page A-2
Appendix B provides step-by-step lab instructions.
• “Lab 1 Details: Reconfiguring Cluster Membership,” page B-3
Appendix C provides complete lab instructions and solutions.
• “Lab Solution 1: Reconfiguring Cluster Membership,” page C-3

Lab 1: Reconfiguring Cluster Membership
[Diagram: cluster configurations before and after Task 1, Task 2, and Task 3]
Use the lab appendix best suited to your experience level:
Appendix A: Lab Synopses
Appendix B: Lab Details
Appendix C: Lab Solutions


Lesson 2
Service Group Interactions


Introduction
Overview
This lesson describes how to configure VCS to control the interactions between application services. In this lesson, you learn how to implement service group dependencies and use resources and triggers to control the startup and failover behavior of service groups.

Importance
In order to effectively implement dependencies between applications in your cluster, you need to use a methodology for translating application requirements to VCS service group dependency rules. By analyzing and implementing service group dependencies, you can factor performance, security, and organizational requirements into your cluster environment.

Lesson Introduction
• Lesson 1: Reconfiguring Cluster Membership
• Lesson 2: Service Group Interactions
• Lesson 3: Workload Management
• Lesson 4: Storage and Network Alternatives
• Lesson 5: Maintaining VCS
• Lesson 6: Validating VCS Implementation


Outline of Topics
• Common Application Relationships
• Service Group Dependency Definition
• Service Group Dependency Examples
• Configuring Service Group Dependencies
• Alternative Methods of Controlling Interactions

Lesson Topics and Objectives
After completing this lesson, you will be able to:
• Common Application Relationships: Describe common example application relationships.
• Service Group Dependency Definition: Define service group dependencies.
• Service Group Dependency Examples: Describe example uses of service group dependencies.
• Configuring Service Group Dependencies: Configure service group dependencies.
• Alternative Methods of Controlling Interactions: Configure alternative methods for controlling service group interactions.


Common Application Relationships
Several examples of application relationships are shown to illustrate common scenarios where service group dependencies are useful for managing services.

Online on the Same System
In this type of relationship, services must run on the same system due to some set of constraints. In the example in the slide, App1 and DB1 communicate using shared memory and therefore must run on the same system. If a fault occurs, they must both be moved to the same system.

Example criteria:
– App1 uses shared memory to communicate with DB1.
– Both must be online on the same system to provide the service.
– DB1 must come online first.
– If either faults (or the system faults), they must fail over to the same system.


Online Anywhere in the Cluster
This example shows an application and database that must be running somewhere in the cluster in order to provide a service. They do not need to run on the same system, but they can, if necessary. For example, if multiple servers were down, DB2 and App2 could run on the remaining server.

Example criteria:
– App2 communicates with DB2 using TCP/IP.
– Both must be online to provide the service.
– They do not have to be online on the same system.
– DB2 must be running before App2 starts.


Online on Different Systems
In this example, both the database and the Web server must be online, but they cannot run on the same system. For example, the combined resource requirements of the two applications may exceed the capacity of a single system, so you want to ensure that they run on separate systems.

Example criteria:
– The Web server requires DB3 to be online first.
– Both must be online to provide the service.
– Web and DB3 cannot run on the same system, due to system usage constraints.
– If Web faults, DB3 should continue to run.


Offline on the Same System
One example relationship is where you have a test version of an application and want to ensure that it does not interfere with the production version. You want to give the production application precedence over the test version for all operations, including manual offline, online, switch, and failover.

Example criteria:
– One node is used for a test version of the service.
– Test and Prod cannot be online on the same system.
– Prod always has priority. Test should be shut down if Prod faults and needs to fail over to that system.


Service Group Dependency Definition
You can set up dependencies between service groups to enforce rules for how VCS manages relationships between application services.

There are four basic criteria for defining how services interact when using service group dependencies:
• A service group can require another group to be online or offline in order to start and run.
• You can specify where the groups must be online or offline.
• You can determine the startup order for service groups by designating one group the child (which comes online first) and the other the parent. In VCS, parent groups depend on child groups. If service group B requires service group A to be online in order to start, then B is the parent and A is the child.
• Failover behavior of linked service groups is specified by designating the relationship soft, firm, or hard. These types determine what happens when a fault occurs in the parent or child group.

Startup Behavior Summary
For all online dependencies, the child group must be online in order for the parent to start. A location of local, global, or remote determines where the parent can come online relative to where the child is online.

For offline local, the child group must be offline on the local system for the parent to come online.

Service Group Dependencies
You can use service group dependencies to specify most application relationships according to these four criteria:
– Category: Online or offline
– Location: Local, remote, or global
– Startup behavior: Parent or child
– Failover behavior: Soft, firm, or hard
You can specify combinations of these characteristics to determine how dependencies affect service group behavior, as shown in a series of examples in this lesson.


Failover Behavior Summary
These general properties apply to failover behavior for linked service groups:
• Target systems are determined by the system list of the service group and the failover policy in a way that should not conflict with the existing service group dependencies.

• If a target system exists, but there is a dependency violation between the service group and a parent service group, the parent service group is migrated to another system to accommodate the child service group that is failing over.

• If conflicts between a child service group and a parent service group arise, the child service group is given priority.

• If there is no system available for failover, the service group remains offline, and no further attempt is made to bring it online.

• If the parent service group faults and fails over, the child service group is not taken offline or failed over except for online local hard dependencies.

Examples are provided in the next section. A complete description of both failover behavior and manual operations for each type of dependency is provided in the job aid.

Failover Behavior Summary
Types apply to online dependencies and define online, offline, and failover operations:
– Soft: The parent can stay online when the child faults.
– Firm:
  – The parent must be taken offline when the child faults.
  – When the child is brought online on another system, the parent is brought online.
– Hard:
  – The child and parent fail over together to the same system when either the child or the parent faults.
  – Hard applies only to an online local dependency.
  – This is allowed only between a single parent and a single child.


Service Group Dependency Examples
A set of animations is used to show how service group dependencies affect failover when different kinds of faults occur.

The following sections provide illustrations and summaries of these examples. A complete description of startup and failover behavior for each type of dependency is provided as a job aid in Appendix D.

Online Local Dependency
In an online local dependency, a child service group must be online on a system before a parent service group can come online on the same system.

Online Local Soft

A link configured as online local soft designates that the parent group stays online while the child group fails over, and then migrates to follow the child.
• Online Local Soft: The child faults.

Online Local Dependency
Startup behavior:
– The child must be online.
– The parent can come online only on the same system.
Failover behavior examples:
– Firm:
  – Child faults: The parent follows the child.
  – Parent faults: The child continues to run.
– Hard: Same as Firm, except when the parent faults:
  – The child is failed over.
  – The parent is then started on the same system.


If a child group in an online local soft dependency faults, the parent service group is migrated to another system only after the child group successfully fails over to that system. If the child group cannot fail over, the parent group is left online.

• Online Local Soft: The parent faults.

If the parent group in an online local soft dependency faults, it stays offline, and the child group remains online.

Online Local Firm

A link configured as online local firm designates that the parent group is taken offline when the child group faults. After the child group fails over, the parent is migrated to that system.
• Online Local Firm: The child faults.

If a child group in an online local firm dependency faults, the parent service group is taken offline on that system. The child group fails over and comes online on another system. The parent group is then started on the system where the child group is now running. If the child group cannot fail over, the parent group is taken offline and stays offline.


• Online Local Firm: The parent faults.

If a parent group in an online local firm dependency faults, the parent service group is taken offline and stays offline.

• Online Local Firm: The system faults.

If a system faults, the child group in an online local firm dependency fails over to another system, and the parent is brought online on the same system.


Online Local Hard

Starting with VCS 4.0, online local dependencies can also be formed as hard dependencies. A hard dependency indicates that the child and the parent service groups fail over together to the same system when either the child or the parent faults. Prior to VCS 4.0, trigger scripts had to be used to cause a fault in the parent service group to initiate a failover of the child service group. With the introduction of hard dependencies, there is no longer a need to use triggers for this purpose. Hard dependencies are allowed only between a single parent and a single child.
• Online Local Hard: The child faults.

If the child group in an online local hard dependency faults, the parent group is taken offline. The child is failed over to an available system. The parent group is then started on the system where the child group is running. The parent service group remains offline if the parent service group cannot fail over.

• Online Local Hard: The parent faults.

If the parent service group in an online local hard dependency faults, the child group is failed over to another system. The parent group is then started on the system where the child group is running. The child service group remains online if the parent service group cannot fail over.


Online Global Dependency
In an online global dependency, a child service group must be online on a system before the parent service group can come online on any system in the cluster, including the system where the child is running.

Online Global Soft

A link configured as online global soft designates that the parent service group remains online when the child service group faults. Whether or not the child service group can fail over to another system does not affect the parent service group.
• Online Global Soft: The child faults.

If the child group in an online global soft dependency faults, the parent continues to run on the original system, and the child fails over to an available system.

• Online Global Soft: The parent faults.
If the parent group in an online global soft dependency faults, the child continues to run on the original system, and the parent fails over to an available system.

Online Global Dependency
Startup behavior:
– The child must be online.
– The parent can come online on any system.
Failover behavior example for online global firm:
– The child faults and is taken offline.
– The parent group is taken offline.
– The child fails over to an available system.
– The parent restarts on an available system.


Online Global Firm

A link configured as online global firm designates that the parent service group is taken offline when the child service group faults. When the child service group fails over to another system, the parent is migrated to an available system. The child and parent can be running on the same or different systems after the failover.
• Online Global Firm: The child faults.

The child faults and is taken offline. The parent group is taken offline. The child fails over to an available system, and the parent fails over to an available system.

• Online Global Firm: The parent faults.
If the parent group in an online global firm dependency faults, the child continues to run on the original system, and the parent fails over to an available system.


Online Remote Dependency
In an online remote dependency, a child service group must be online on a remote system before the parent service group can come online on the local system.

Online Remote Soft

An online remote soft dependency designates that the parent service group remains online when the child service group faults, as long as the child service group chooses another system to fail over to. If the child service group chooses to fail over to the system where the parent was online, the parent service group is migrated to any other available system.

Online Remote Dependency
Startup behavior:
– The child must be online.
– The parent can come online only on a remote system.
Failover behavior example for online remote soft:
– The child faults and fails over to an available system.
– If the only available system is where the parent is online, the parent is taken offline before the child is brought online. The parent then restarts on a system different from the child's. Otherwise, the parent continues to run on the original system.


• Online Remote Soft: The child faults.

The child group faults and fails over to an available system. If the only available system has the parent running, the parent is taken offline before the child is brought online. The parent then restarts on a different system. If the parent is online on a system that is not selected for child group failover, the parent continues to run on the original system.

• Online Remote Soft: The parent faults.
The parent group faults and is taken offline. The child group continues to run on the original system. The parent group fails over to an available system. If the only available system is running the child group, the parent stays offline.

Online Remote Firm

A link configured as online remote firm is similar to online global firm, with the exception that the parent service group is brought online on any system other than the system on which the child service group was brought online.
• Online Remote Firm: The child faults.

The child group faults and is taken offline. The parent group is taken offline. The child fails over to an available system. If the child fails over to the system where the parent was online, the parent restarts on a different system; otherwise, the parent restarts on the system where it was online.

• Online Remote Firm: The parent faults.
The parent group faults and is taken offline. The child group continues to run on the original system. The parent fails over to an available system. If the only available system is where the child is online, the parent stays offline.


Offline Local Dependency
In an offline local dependency, the parent service group can be started only if the child service group is offline on the local system. Similarly, the child can only be started if the parent is offline on the local system. This prevents conflicting applications from running on the same system.
• Offline Local Dependency: The child faults.

The child group faults and fails over to an available system. If the only available system is where the parent is online, the parent is taken offline before the child is brought online. The parent then restarts on a system different than the child’s system. Otherwise, the parent continues to run on the original system.

• Offline Local Dependency: The parent faults.
The parent faults and is taken offline. The child continues to run on the original system. The parent fails over to an available system where the child is offline. If the only available system is where the child is online, the parent stays offline.

Offline Local Dependency
Startup behavior:
– The child can come online anywhere the parent is offline.
– The parent can come online only where the child is offline.
Failover behavior example when the child faults:
– The child fails over to an available system.
– If the only available system is where the parent is online, the parent is taken offline before the child is brought online. The parent then restarts on a system different from the child's; otherwise, the parent continues to run.


Configuring Service Group Dependencies
Service Group Dependency Rules
You can use service group dependencies to implement parent/child relationships between applications. Before using service group dependencies to implement the relationships between multiple application services, you need to have a good understanding of the rules governing these dependencies:
• Service groups can have multiple parent service groups. This means that an application service can have multiple other application services depending on it.
• A service group can have only one child service group. This means that an application service can be dependent on only one other application service.
• A group dependency tree can be no more than three levels deep.
• Service groups cannot have cyclical dependencies.

Service Group Dependency Rules
These rules determine how you specify dependencies:
– Child has priority
– Multiple parents
– Only one child
– Maximum of three levels
– No cyclical dependencies


Creating Service Group Dependencies
You can create service group dependencies from the command-line interface using the hagrp command or through the Cluster Manager. To create a dependency, link the groups and specify the relationship (dependency) type, indicating whether it is soft, firm, or hard.

If not specified, service group dependencies are firm by default.
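For example, dependencies can be created from the CLI as follows (appSG, dbSG, webSG, and nfsSG are hypothetical group names used only for illustration):
hagrp -link appSG dbSG online global soft
hagrp -link webSG nfsSG online remote
The second command creates a firm dependency because no type is specified.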

To configure service group dependencies using the Cluster Manager, you can either right-click the parent service group and select Link to display the Link Service Groups view that is shown on the slide, or you can use the Service Group View.

Removing Service Group Dependencies
You can remove service group dependencies from the command-line interface (CLI) or the Cluster Manager. You do not need to specify the type of dependency while removing it, because only one dependency is allowed between two service groups.
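For example, using the same hypothetical group names, the dependency is removed with:
hagrp -unlink appSG dbSG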

Creating Service Group Dependencies
hagrp -link Parent Child online local firm
The dependency is recorded in the main.cf file within the parent group definition, alongside the service group attributes, resource definitions, and resource dependencies:
group G1 (
    ...
    )
    ...
    requires group G2 online local firm


Alternative Methods of Controlling Interactions
Limitations of Service Group Dependencies
The example scenario described in the slide cannot be implemented using only service group dependencies. You cannot create a link from the application service group to the NFS service group if you have a link from the application service to the database, because a parent service group can only have one child.

When service group dependency rules prevent you from implementing the types of dependencies that you require in your cluster environment, you can use resources or triggers to define relationships between service groups.

Limitations of Service Group Dependencies
Consider these requirements:
• These services need to be online at the same time:
  – App needs DB to be online.
  – Web needs NFS to be online.
• These services should not be online on the same system at the same time:
  – Application and database
  – Application and NFS service
[Diagram: App, Web, DB, and NFS service groups linked by online global, online remote, and offline local dependencies]
The App service group cannot have two child service groups.


Using Resources to Control Service Group Interactions
Another method for controlling the interactions between service groups is to configure special resources that indicate whether the service group is online or offline on a system.

VCS provides several resource types, such as FileOnOff and ElifNone, that can be used to create dependencies.

This example demonstrates how resources can be used to prevent service groups from coming online on the same system:
• S1 has a service group, App, which contains an ElifNone resource. An ElifNone resource is considered online only if the specified file is absent. In this case, the ElifNone resource is online only if /tmp/NFSon does not exist.
• S2 has a service group, NFS, which contains a FileOnOff resource. This resource creates the /tmp/NFSon file when it is brought online.
• Both the ElifNone and FileOnOff resources are critical, and all other resources in the respective service groups are dependent on them. If the resources fault, the service group fails over.

When operating on different systems, each service group can be online at the same time, because these resources have no interactions.
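A minimal main.cf sketch of this arrangement follows; the group, resource, and application names are illustrative assumptions, and the remaining resources of each group are omitted:
group App (
    SystemList = { S1 = 0, S2 = 1 }
    )

    ElifNone nfs_absent (
        PathName = "/tmp/NFSon"
        )

    Process app_proc (
        PathName = "/opt/app/bin/appd"
        )

    app_proc requires nfs_absent

group NFS (
    SystemList = { S1 = 1, S2 = 0 }
    )

    FileOnOff nfs_marker (
        PathName = "/tmp/NFSon"
        )

Because the ElifNone and FileOnOff resources are critical by default, a fault on either one causes its service group to be taken offline and failed over, as described in this example.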



If NFS fails over to S1, /tmp/NFSon is created on S1 when the FileOnOff resource is brought online.

The ElifNone resource faults when it detects the presence of /tmp/NFSon. Because this resource is critical and all other resources are parent (dependent) resources, App is taken offline.

Make the MonitorInterval and the OfflineMonitorInterval short (about five to ten seconds) for the ElifNone resource type. This enables the parent service group to fail over to the empty system in a timely manner. The fault is cleared on the ElifNone resource when it is monitored, because this is a persistent resource. Faulted resources are monitored periodically according to the value of the OfflineMonitorInterval attribute.
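One way to shorten these intervals from the command line (the values shown are examples only):
haconf -makerw
hatype -modify ElifNone MonitorInterval 5
hatype -modify ElifNone OfflineMonitorInterval 10
haconf -dump -makero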

Example of Offline Local Dependency Using Resources
[Diagram: The NFS service group fails over to S1; its FileOnOff resource creates /tmp/NFSon, faulting the ElifNone resource in the App service group.]


Using Triggers to Control Service Group Interactions
VCS provides several event triggers that can be used to enforce service group relationships, including:
• PreOnline: VCS runs the preonline script before bringing a service group online.
  The PreOnline trigger must be enabled for each applicable service group by setting the PreOnline service group attribute. For example, to enable the PreOnline trigger for GroupA, type:
  hagrp -modify GroupA PreOnline 1

• PostOnline: The postonline script is run after a service group is brought online.

• PostOffline: The postoffline script is run after a service group is taken offline.

PostOnline and PostOffline are enabled automatically if the script is present in the $VCS_HOME/bin/triggers directory. Be sure to copy triggers to all systems in the cluster. When present, these triggers apply to all service groups.

Consider implementing triggers only after investigating whether VCS native facilities can be used to configure the desired behavior. Triggers add complexity, requiring programming skills as opposed to simply configuring VCS objects and attributes.
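As a rough sketch only, a preonline trigger is a script placed in the triggers directory. The argument handling below assumes that the system name is passed as the first argument and the group name as the second; verify the argument list against the sample trigger scripts shipped with your VCS version before relying on it.
#!/bin/sh
# /opt/VRTSvcs/bin/triggers/preonline (sketch)
sys=$1      # system where the group is about to be brought online (assumed)
group=$2    # name of the service group (assumed)

# Site-specific checks go here; exit without continuing to block the online.

# Continue bringing the group online, bypassing the preonline trigger this time:
hagrp -online -nopre $group -sys $sys
exit 0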



Summary
This lesson covered service group dependencies. In this lesson, you learned how to translate business rules to VCS service group dependency rules. You also learned how to use resources and triggers as alternatives to service group dependencies.

Next Steps
The next lesson introduces failover policies and discusses how VCS chooses a failover target.

Additional Resources
• VERITAS Cluster Server User’s Guide
  This document describes VCS service group dependency types and rules. This guide also provides detailed descriptions of resources and triggers, in addition to information about service groups and failover behavior.
• Appendix D, “Job Aids”
  This appendix includes a table containing a complete description of service group behavior for each dependency case.

Lesson Summary
Key Points
– You can use service group dependencies to control interactions among applications.
– You can also use triggers and specialized resources to manage application relationships.
Reference Materials
– VERITAS Cluster Server User's Guide
– Appendix D, “Job Aids”


Lab 2: Service Group Dependencies
Labs and solutions for this lesson are located on the following pages.

Appendix A provides brief lab instructions for experienced students.
• “Lab 2 Synopsis: Service Group Dependencies,” page A-7
Appendix B provides step-by-step lab instructions.
• “Lab 2 Details: Service Group Dependencies,” page B-17
Appendix C provides complete lab instructions and solutions.
• “Lab 2 Solution: Service Group Dependencies,” page C-25

Goal
The purpose of this lab is to configure service group dependencies and observe the effects on manual and failover operations.

Results
Each student's service groups have been configured in a series of service group dependencies. After completing the testing, the dependencies are removed, and each student's service groups should be running on their own system.

Prerequisites
Obtain any classroom-specific values needed for your classroom lab environment and record these values in your design worksheet included with the lab exercise instructions.

Lab 2: Service Group Dependencies
[Diagram: nameSG2 (parent) and nameSG1 (child) linked by online local, online global, and offline local dependencies]
Appendix A: Lab Synopses
Appendix B: Lab Details
Appendix C: Lab Solutions


Lesson 3
Workload Management


IntroductionOverviewThis lesson describes in detail the Service Group Workload Management (SGWM) feature used for choosing a system to run a service group both at startup and during a failover. SGWM enables system administrators to control where the service groups are started in a multinode cluster environment.

Importance
Understanding and controlling how VCS chooses a system on which to start a service group, and how it selects a failover target when it detects a fault, is crucial when designing and configuring multinode clusters with multiple application services.

Lesson Introduction
• Lesson 1: Reconfiguring Cluster Membership
• Lesson 2: Service Group Interactions
• Lesson 3: Workload Management
• Lesson 4: Storage and Network Alternatives
• Lesson 5: Maintaining VCS
• Lesson 6: Validating VCS Implementation


Outline of Topics
• Startup Rules and Policies
• Failover Rules and Policies
• Controlling Overloaded Systems
• Additional Startup and Failover Controls
• Configuring Startup and Failover Policies
• Using the Simulator

Lesson Topics and Objectives
After completing this lesson, you will be able to:
• Startup Rules and Policies: Describe the rules and policies for service group startup.
• Failover Rules and Policies: Describe the rules and policies for service group failover.
• Controlling Overloaded Systems: Configure policies to control overloaded systems.
• Additional Startup and Failover Controls: Apply additional controls for startup and failover.
• Configuring Startup and Failover Policies: Configure startup and failover policies.
• Using the Simulator: Use the Simulator to model workload management.


Startup Rules and Policies
Rules for Automatic Service Group Startup
The following conditions must be satisfied for a service group to be started automatically:
• The service group AutoStart attribute must be set to the default value of 1. If this attribute is changed to 0, VCS leaves the service group offline and waits for an administrative command to be issued to bring the service group online. (A command sketch for setting these attributes follows this list.)
• The service group definition must have at least one system in its AutoStartList attribute.
• All of the systems in the service group's SystemList must be in the RUNNING state so that the service group can be probed on all systems on which it can run. If there are systems on which the service group can run that have not yet joined the cluster, VCS autodisables the service group until it is probed on all of those systems.

The startup system for the service group is chosen as follows:
1 A subset of the systems included in the AutoStartList attribute is selected:
   a Frozen systems are eliminated.
   b Systems where the service group has a FAULTED status are eliminated.
   c Systems that do not meet the service group requirements are eliminated, as described in detail later in the lesson.
2 The target system is chosen from this list based on the startup policy defined for the service group.
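The following is a minimal command sketch for setting the attributes that control automatic startup. The group name websg and the system names sysA and sysB are placeholders, not values from this course configuration:
haconf -makerw
hagrp -modify websg AutoStart 1
hagrp -modify websg AutoStartList sysA sysB
haconf -dump -makero
hagrp -value websg AutoStartList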

Rules for Automatic Service Group Startup
• The service group must have its AutoStart attribute set to 1 (the default value).
• The service group must have a nonempty AutoStartList attribute consisting of the systems where it can be started.
• All the systems that the service group can run on must be up and running.
• The startup system is selected as follows:
   – A subset of systems that meet the service group requirements from among the systems in the AutoStartList is created first (described later in detail).
   – Frozen systems and systems where the service group has a FAULTED status are eliminated from the list.
   – The target system is selected based on the startup policy of the service group.


Automatic Startup Policies
You can set the AutoStartPolicy attribute of a service group to one of these three values:
• Order: Systems are chosen in the order in which they are defined in the AutoStartList attribute. This is the default policy for every service group.
• Priority: The system with the lowest priority number in SystemList is selected. Note that this system should also be listed in AutoStartList.
• Load: The system with the highest available capacity is selected.

These policies are described in more detail in the following pages.

To configure the AutoStartPolicy attribute of a service group, execute:
hagrp -modify groupname AutoStartPolicy policy
where the possible values for policy are Order, Priority, and Load. You can also set this attribute using the Cluster Manager GUI.

Note: The configuration must be open to change service group attributes.

Automatic Startup Policies
The AutoStartPolicy attribute specifies how a target system is selected:
– Order: The first available system according to the order in AutoStartList is selected (default).
– Priority: The system with the lowest priority number in SystemList is selected.
– Load: The system with the greatest available capacity is selected.
Example configuration:
hagrp -modify groupname AutoStartPolicy Load
Detailed examples are provided on the next set of pages.


AutoStartPolicy=Order

When the AutoStartPolicy attribute of a service group is set to the default value of Order, the first system available in AutoStartList is selected to bring the service group online. The priority numbers in SystemList are ignored.

In the example shown on the slide, the AP1 service group is brought online on SVR1, although it is the system with the highest priority number in SystemList. Similarly, the AP2 service group is brought online on SVR2, and the DB service group is brought online on SVR3 because these are the first systems listed in the AutoStartList attributes of the corresponding service groups.

Note: Because Order is the default value for the AutoStartPolicy attribute, it is not required to be listed in the service group definitions in the main.cf file.
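As a rough illustration, a main.cf fragment consistent with this behavior for AP1 might look like the following. The priority numbers are assumptions for illustration only, because the slide values are not reproduced in this text:
group AP1 (
    SystemList = { SVR2 = 0, SVR3 = 1, SVR1 = 2 }
    AutoStartList = { SVR1, SVR2, SVR3 }
    )
With AutoStartPolicy left at the default of Order, AP1 starts on SVR1, the first available system in AutoStartList, even though SVR1 has the highest priority number in SystemList.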



AutoStartPolicy=Priority

When the AutoStartPolicy attribute of a service group is set to Priority, the system with the lowest priority number in the SystemList that also appears in the AutoStartList is selected as the target system during start-up. In this case, the order of systems in the AutoStartList is ignored.

The same example service groups are now modified to use the Priority AutoStartPolicy, as shown on the slide. In this example, the AP1 service group is brought online on SVR3, which has the lowest priority number in SystemList, although it appears as the last system in AutoStartList. Similarly, the AP2 service group is brought online on SVR1 (with priority number 0), and the DB service group is brought online on SVR2 (with priority number 1).

Note how the startup systems have changed for the service groups by changing AutoStartPolicy, although the SystemList and AutoStartList attributes are the same for these two examples.



AutoStartPolicy=Load

When AutoStartPolicy is set to Load, VCS determines the target system based on the existing workload of each system listed in the AutoStartList attribute and the load that is added by the service group.

These attributes control load-based startup:
• Capacity: A user-defined system attribute that contains a value representing the total amount of load that the system can handle.
• Load: A user-defined service group attribute that defines the amount of capacity required to run the service group.
• AvailableCapacity: A system attribute maintained by VCS that quantifies the remaining available system load.

In the example displayed on the slide, the design criteria specify that three servers have Capacity set to 300. SRV1 is selected as the target system for starting SG4 because it has the highest AvailableCapacity value of 200.

Determine Load and Capacity

You must determine a value for Load for each service group. This value is based on how much of the system capacity is required to run the application service that is managed by the service group.

When a service group is brought online, the value of its Load attribute is subtracted from the system Capacity value, and AvailableCapacity is updated to reflect the difference.
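As a simple worked sketch, using the hypothetical names S1 and G1 and assuming the configuration is open for writing:
haconf -makerw
hasys -modify S1 Capacity 300
hagrp -modify G1 Load 75
haconf -dump -makero
hasys -value S1 AvailableCapacity
If G1 is the only service group online on S1, the last command displays 225 (300 minus 75).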



Note: Both the Capacity attribute of a system and the Load attribute of a service group are static user-defined attributes based on your design criteria.

How a Service Group Starts Up

When the cluster initially starts up, the following events take place for service groups using the Load AutoStartPolicy:
1 Service groups are placed in an AutoStart queue in the order that probing is completed for each service group. Decisions for each service group are made serially, but the actual startup of service groups takes place in parallel.
2 For each service group in the AutoStart queue, VCS selects a subset of potential systems from the AutoStartList, as follows:
   a Frozen systems are eliminated.
   b Systems where the service group has a FAULTED status are eliminated.
   c Systems that do not meet the service group requirements are eliminated. This topic is explained in detail later in the lesson.
3 From this list, the target system with the highest value for AvailableCapacity is chosen. If multiple systems have the same AvailableCapacity value, the first one in canonical order is selected.
4 VCS then recalculates the new AvailableCapacity value for that target system by subtracting the Load of the service group from the system's current AvailableCapacity value before proceeding with other service groups in the queue.

Note: If no system has a high enough AvailableCapacity value for a service group's load, the service group is still started on the system with the highest value for AvailableCapacity, even if the resulting AvailableCapacity value is zero or a negative number.


Failover Rules and Policies
Rules for Automatic Service Group Failover
The following conditions must be satisfied for a service group to be automatically failed over after a fault:
• The service group must contain a critical resource, and that resource must fault or be a parent of a faulted resource.
• The service group AutoFailOver attribute must be set to the default value of 1. If this attribute is changed to 0, VCS leaves the service group offline after a fault and waits for an administrative command to be issued to bring the service group online. (A command sketch for these attributes follows this list.)
• The service group cannot be frozen.
• At least one of the systems in the service group's SystemList attribute must be in the RUNNING state.

The failover system for the service group is chosen as follows:
• A subset of the systems included in the SystemList attribute is selected:
   – Frozen systems are eliminated, and systems where the service group has a FAULTED status are eliminated.
   – Systems that do not meet the service group requirements are eliminated, as described in detail later in the lesson.
• The target system is chosen from this subset based on the failover policy defined for the service group.
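The following is a minimal command sketch for the attributes involved in automatic failover. The group name appsg and the resource name appvol are placeholders:
haconf -makerw
hagrp -modify appsg AutoFailOver 1
hares -modify appvol Critical 1
haconf -dump -makero
hagrp -value appsg AutoFailOver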

Rules for Automatic Service Group Failover
• The service group must have a critical resource.
• The service group AutoFailOver attribute must be set to 1, and ManageFaults must be set to ALL (the default values).
• The service group cannot be frozen.
• At least one system in the service group's SystemList attribute must be up and running.
• The failover system is selected as follows:
   – A subset of systems that meet the service group requirements from among the systems in the SystemList is created first (described later in detail).
   – Frozen systems and systems where the service group has a FAULTED status are eliminated from the list.
   – Systems that do not meet service group requirements are eliminated.
   – The target system is selected based on the failover policy of the service group.


Failover Policies
VCS supports a variety of policies that determine how a system is selected when service groups must migrate due to faults. The policy is configured by setting the FailOverPolicy attribute to one of these values:
• Priority: The system with the lowest priority number is preferred for failover (default).
• RoundRobin: The system with the least number of active service groups is selected for failover.
• Load: The system with the highest value of the AvailableCapacity system attribute is selected for failover.

Policies are discussed in more detail in the following pages.

Failover Policies
The FailOverPolicy attribute specifies how a target system is selected:
– Priority: The system with the lowest priority number in the list is selected (default).
– RoundRobin: The system with the least number of active service groups is selected.
– Load: The system with the greatest available capacity is selected.
Example configuration:
hagrp -modify groupname FailOverPolicy Load
Detailed examples are provided on the next set of pages.


FailOverPolicy=Priority

When FailOverPolicy is set to Priority, VCS selects the system with the lowest assigned value from the SystemList attribute.

For example, the DB service group has three systems configured in the SystemList attribute and the same order for the AutoStartList values:
SystemList = { SVR3 = 0, SVR1 = 1, SVR2 = 2 }
AutoStartList = { SVR3, SVR1, SVR2 }

The DB service group is initially started on SVR3 because it is the first system in AutoStartList. If DB faults on SVR3, VCS selects SVR1 as the failover target because it has the lowest priority value for the remaining available systems.

Priority policy is the default behavior and is ideal for simple two-node clusters or small clusters with few service groups.



FailOverPolicy=RoundRobin

The RoundRobin policy selects the system running the fewest service groups as the failover target.

The round robin policy is ideal for large clusters running many service groups with essentially the same server load characteristics (for example, similar databases or applications).

Consider these properties of the RoundRobin policy:
• Only systems listed in the SystemList attribute for the service group are considered when VCS selects a failover target; this applies to all failover policies, including RoundRobin.

• A service group that is in the process of being brought online is not considered an active service group until it is completely online.

Ties are determined by the order of systems in the SystemList attribute. For example, if two failover target systems have the same number of service groups running, the system listed first in the SystemList attribute is selected for failover.
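As a rough way to see how many service groups are currently online on each system, a sketch follows; the exact column layout of the hagrp -state output can vary by VCS version, so treat the awk field number as an assumption:
hagrp -state | grep '|ONLINE|' | awk '{ print $3 }' | sort | uniq -c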



FailOverPolicy=Load

When FailOverPolicy is set to Load, VCS determines the target system based on the existing workload of each system listed in the SystemList attribute and the load that is added by the service group.

These attributes control load-based failover:
• Capacity: A system attribute that contains a value representing the total amount of load that the system can handle.
• Load: A service group attribute that defines the amount of capacity required to run the service group.
• AvailableCapacity: A system attribute maintained by VCS that quantifies the remaining available system load.

In the example displayed in the slide, three servers have Capacity set to 300, and the fourth is set to 150. Each service group has a fixed load defined by the user, which is subtracted from the system capacity to find the AvailableCapacity value of a system.

When failover occurs, VCS checks the value of AvailableCapacity on each potential target (each system in the SystemList attribute for the service group) and starts the service group on the system with the highest value.

Note: In the event that no system has a high enough AvailableCapacity value for a service group load, the service group still fails over to the system with the highest value for AvailableCapacity, even if the resulting AvailableCapacity value is zero or a negative number.



Integrating Dynamic Load Calculations
The load-based startup and failover examples in earlier sections were based on static values of load. That is, the Capacity value of each system and the Load value for each service group are fixed, user-defined values.

The VCS workload balancing mechanism can be integrated with other software programs, such as Precise, that calculate system load to support failover based on a dynamically set value.

If the DynamicLoad attribute is set for a system, VCS calculates AvailableCapacity by subtracting the value of DynamicLoad from Capacity. In this case, the Load values of service groups are not used to determine AvailableCapacity.

The DynamicLoad value must be set by the load-estimation software using the hasys command. For example:
hasys -load Svr1 90

This command sets DynamicLoad to the value of 90. If Capacity is 300, then AvailableCapacity is calculated to be 210, regardless of the Load values of the service groups online on the system.

Note: If your third-party load-estimation software provides a value that represents the percentage of system load, you must consider the value of Capacity when setting the load. For example, if Capacity is 300 and the load-estimation software determines that the system is 30 percent loaded, you must set the load to 90.
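A minimal shell sketch of how external monitoring software might set DynamicLoad; the get_cpu_pct helper is hypothetical and stands in for whatever your monitoring tool provides:
CAPACITY=300                      # must match the Capacity attribute of the system
PCT=$(get_cpu_pct)                # hypothetical helper that prints CPU usage, for example 30
LOAD=$(( CAPACITY * PCT / 100 ))  # 90 when the system is 30 percent loaded
hasys -load Svr1 $LOAD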

Integrating Dynamic Load Calculations
You can control VCS startup and failover based on dynamic load by integrating with load-monitoring software, such as Precise.
1. External software monitors CPU usage.
2. External software sets the DynamicLoad attribute according to the system Capacity value using hasys -load system value.
Example: The Capacity attribute is set to 300 (a static value). Monitoring software determines that CPU usage is 30 percent. The external software sets the DynamicLoad attribute to 90 (30 percent of 300).


Controlling Overloaded Systems
The LoadWarning Trigger
You can configure the LoadWarning trigger to provide notification that a system has sustained a predetermined load level for a specified period of time.

To configure the LoadWarning trigger:
• Create a loadwarning script in the /opt/VRTSvcs/bin/triggers directory. You can copy the sample trigger script from /opt/VRTSvcs/bin/sample_triggers as a starting point, and then modify it according to your requirements. See the example script that follows.
• Set the load-related attributes for the system:
   – Capacity: The load capacity of the system.
   – LoadWarningLevel: The level at which load has reached a critical limit, expressed as a percentage of the Capacity attribute. The default is 80 percent.
   – LoadTimeThreshold: The length of time, in seconds, that a system must remain at or above LoadWarningLevel before the trigger is run. The default is 600 seconds.

The LoadWarning Trigger
You can configure the LoadWarning trigger to run when a system has been running at a specified percentage of the Capacity level for a specified period of time.
To configure the trigger:
– Copy the sample loadwarning script into /opt/VRTSvcs/bin/triggers.
– Modify the script to perform some action.
– Set the system attributes.
This example configuration causes VCS to run the trigger if the Svr4 system runs at 90 percent of capacity for ten minutes:
System Svr4 (
    Capacity = 150
    LoadWarningLevel = 90
    LoadTimeThreshold = 600
    )
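A sketch of the equivalent commands, run with the configuration open for writing:
haconf -makerw
hasys -modify Svr4 Capacity 150
hasys -modify Svr4 LoadWarningLevel 90
hasys -modify Svr4 LoadTimeThreshold 600
haconf -dump -makero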


Example Script
A portion of the sample script, /opt/VRTSvcs/bin/sample_triggers/loadwarning, is shown to illustrate how you can provide a basic operator warning. You can customize this script to perform other actions, such as switching or shutting down service groups.
# @(#)/opt/VRTSvcs/bin/triggers/loadwarning
@recipients=("username\@servername.com");
#
$msgfile="/tmp/loadwarning";
`echo system = $ARGV[0], available capacity = $ARGV[1] > $msgfile`;
foreach $recipient (@recipients) {
    ## Must have elm setup to run this.
    `elm -s loadwarning $recipient < $msgfile`;
}
`rm $msgfile`;
exit


Additional Startup and Failover Controls
Limits and Prerequisites
VCS enables you to define the available resources on each system and the corresponding requirements for these resources for each service group. Shared memory, semaphores, and the number of processors are all examples of resources that can be defined on a system.

Note: The resources that you define are arbitrary—they do not need to correspond to physical or software resources. You then define the corresponding prerequisites for a service group to come online on a system.

In a multinode, multiapplication services environment, VCS keeps track of the available resources on a system by subtracting the resources already in use by service groups online on each system from the maximum capacity for that resource. When a new service group is brought online, VCS checks these available resources against service group prerequisites; the service group cannot be brought online on a system that does not have enough available resources to support the application services.



System Limits

The Limits system attribute is used to define the resources and the corresponding capacity of each system for that resource. You can use any keyword for a resource as long as you use the same keyword on all systems and service groups.

The example values displayed in the slide are set as follows:
• On the first two systems, the Limits attribute setting in main.cf is:
  Limits = { CPUs = 12, Mem = 512 }
• On the second two systems, the Limits attribute setting in main.cf is:
  Limits = { CPUs = 6, Mem = 256 }

Service Group Prerequisites

Prerequisites is a service group attribute that defines the set of resources needed to run the service group; its values correspond to the resources named in the Limits system attribute. This main.cf configuration corresponds to the SG1 service group in the diagram:
Prerequisites = { CPUs = 6, Mem = 256 }

Current Limits

CurrentLimits is an attribute maintained by VCS that contains the value of the remaining available resources for a system. For example, if the limit for Mem is 512 and the SG1 service group is online with a Mem prerequisite of 256, the CurrentLimits setting for Mem is 256:
CurrentLimits = { CPUs = 6, Mem = 256 }

Selecting a Target System
Prerequisites are used to determine a subset of eligible systems on which a service group can be started during failover or startup. When the list of eligible systems is created, had (the VCS engine) then follows the configured policy for automatic startup or failover.

Note: A value of 0 is assumed for systems that do not have some or all of the resources defined in their Limits attribute. Similarly, a value of 0 is assumed for service groups that do not have some or all of the resources defined in their Prerequisites attribute.
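To inspect these values on a running cluster, a quick sketch, assuming a system named S1 and a service group named SG1:
hasys -value S1 Limits
hasys -value S1 CurrentLimits
hagrp -value SG1 Prerequisites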


Combining Capacity and Limits
Capacity and Limits can be combined to determine appropriate startup and failover behavior for service groups.

When they are used together, VCS uses this process to determine the target:
1 Prerequisites and Limits are checked to determine a subset of systems that are potential targets.
2 The Capacity and Load attributes are used to determine which system has the highest AvailableCapacity value.
3 When multiple systems have the same AvailableCapacity value, the system listed first in SystemList is selected.

System Limits are hard values, meaning that if a system does not meet the requirements specified in the Prerequisites attribute for a service group, the service group cannot be started on that system.

Capacity is a soft limit, meaning that the system with the highest value for AvailableCapacity is selected, even if the resulting available capacity is a negative number.
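A hypothetical main.cf sketch that combines both mechanisms for one system and one service group; the names and values are illustrative only:
system S1 (
    Capacity = 300
    Limits = { Processors = 12, Mem = 512 }
    )

group G1 (
    SystemList = { S1 = 1, S2 = 2 }
    AutoStartList = { S1, S2 }
    AutoStartPolicy = Load
    FailOverPolicy = Load
    Load = 75
    Prerequisites = { Processors = 6, Mem = 256 }
    )
A system that fails the Prerequisites check is removed from consideration entirely; among the systems that remain, the one with the highest AvailableCapacity is chosen.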



Configuring Startup and Failover Policies
Setting Load and Capacity
You can use the VCS GUI or the command-line interface to set the Capacity system attribute and the Load service group attribute.

To set Capacity from the command-line interface, use the hasys -modify command as shown in the following example:
hasys -modify S1 Capacity 300

To set Load from the CLI, use the hagrp -modify command as shown in the following example:
hagrp -modify G1 Load 75

Setting Load and Capacity
hasys -modify S1 Capacity 300
hagrp -modify G1 Load 75
The resulting main.cf entries:
System S1 (
    Capacity = 300
    )
group G1 (
    SystemList = { S1 = 1, S2 = 2 }
    AutoStartList = { S1, S2 }
    AutoStartPolicy = Load
    Load = 75
    )


Setting Limits and Prerequisites
You can use the VCS GUI or the command-line interface to set the Limits system attribute and the Prerequisites service group attribute.

To set Limits from the command-line interface, use the hasys -modify command as shown in the following example:
hasys -modify S1 Limits Processors 2 Mem 512

To set Prerequisites from the CLI, use the hagrp -modify command as shown in the following example:
hagrp -modify G1 Prerequisites Processors 1 Mem 50

Notes:
• To be able to set these attributes, open the VCS configuration to enable read/write mode and ensure that the service groups that are already online on a system do not violate the restrictions.
• The order in which the resources are defined within the Limits or Prerequisites attributes is not important.

Setting Limits and Prerequisites
hasys -modify S1 Limits Processors 2 Mem 512
hagrp -modify G1 Prerequisites Processors 1 Mem 50
The resulting main.cf entries:
System S1 (
    Limits = { Processors = 2, Mem = 512 }
    )
group G1 (
    ...
    Prerequisites = { Processors = 1, Mem = 50 }
    )


• To change an existing Limits or Prerequisites attribute, such as adding a new resource, removing a resource, or updating a resource definition, use the -add, -delete, or -update keyword, respectively, with the hasys -modify or hagrp -modify command, as shown in the following examples:
   – The command
     hasys -modify S1 Limits -add Semaphores 10
     changes the S1 Limits attribute to
     Limits = { Processors = 2, Mem = 512, Semaphores = 10 }
   – The command
     hasys -modify S1 Limits -update Processors 4
     changes the S1 Limits attribute to
     Limits = { Processors = 4, Mem = 512, Semaphores = 10 }
   – The command
     hasys -modify S1 Limits -delete Mem
     changes the S1 Limits attribute to
     Limits = { Processors = 4, Semaphores = 10 }


Using the Simulator
Modeling Workload Management
The VCS Simulator is a good tool for modeling the behavior that you require before making changes to the running configuration. This enables you to fully understand the implications and effects of different workload management configurations.

You can use the Simulator to create and test workload management scenarios before deploying the configuration in a running cluster. For example:
1. Copy the real main.cf file into the Simulator directory.
2. Set up the workload management configuration.
3. Test all startup and failover scenarios.
4. Copy the Simulator main.cf file back to the cluster config directory.
5. Restart the cluster using the new configuration.
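A rough sketch of the round trip follows; the Simulator configuration directory varies by installation, so <sim_config_dir> is a placeholder:
cp /etc/VRTSvcs/conf/config/main.cf <sim_config_dir>/main.cf
# Edit and test the copy in the Simulator, then copy it back:
cp <sim_config_dir>/main.cf /etc/VRTSvcs/conf/config/main.cf
hastop -all -force
hastart
The hastop -all -force command stops VCS on all systems while leaving the applications running; run hastart on each system to restart VCS with the new configuration.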


Summary
This lesson described in detail how VCS chooses a system on which to run a service group, both at startup and during failover. This lesson introduced Service Group Workload Management, which enables VCS administrators to configure this behavior. The lesson also showed methods to integrate dynamic load calculations with VCS and to control overloaded systems.

Next Steps
The next lesson describes alternate storage and network configurations, including local NIC failover and integration of third-party volume management software.

Additional Resources
• VERITAS Cluster Server User's Guide
  This document describes VCS Service Group Workload Management. The guide also provides detailed descriptions of resources and triggers, in addition to information about service groups and failover behavior.

Lesson Summary
Key Points
– Workload management policies provide fine-grained control of service group startup and failover.
– You can use the Simulator to model behavior before you implement policies in the cluster.
Reference Materials
– VERITAS Cluster Server User's Guide


Lab 3: Testing Workload Management
Labs and solutions for this lesson are located on the following pages.
• Appendix A provides brief lab instructions for experienced students: "Lab 3 Synopsis: Testing Workload Management," page A-14.
• Appendix B provides step-by-step lab instructions: "Lab 3 Details: Testing Workload Management," page B-29.
• Appendix C provides complete lab instructions and solutions: "Lab 3 Solution: Testing Workload Management," page C-45.

Goal
The purpose of this lab is to use the Simulator with a preconfigured main.cf file and observe the effects of workload management on manual and failover operations.

Prerequisites
Obtain any classroom-specific values needed for your classroom lab environment and record these values in your design worksheet included with the lab exercise instructions.

Results
Document the effects of workload management in the lab appendix.

Lab 3: Testing Workload Management
Simulator config file location: _________________________________________
Copy to: ___________________________________________
Lab instructions are provided in Appendix A (Lab Synopses), Appendix B (Lab Details), and Appendix C (Lab Solutions).


Lesson 4
Alternate Storage and Network Configurations


Introduction
Overview
This lesson describes how you can integrate different types of volume management software, as well as raw disks, within your cluster configuration. You also learn how to configure alternative network resources that enable local NIC failover.

Importance
The alternate storage and network configurations discussed in this lesson are examples that show you the flexibility VCS provides. More specifically, one of the examples discusses how to avoid failover due to networking problems by using multiple interfaces on a system.

Lesson Introduction
• Lesson 1: Reconfiguring Cluster Membership
• Lesson 2: Service Group Interactions
• Lesson 3: Workload Management
• Lesson 4: Storage and Network Alternatives
• Lesson 5: Maintaining VCS
• Lesson 6: Validating VCS Implementation


Outline of Topics
• Alternative Storage and Network Configurations
• Additional Network Resources
• Additional Network Design Requirements
• Example MultiNIC Setup

Lesson Topics and Objectives
After completing this lesson, you will be able to:
• Alternative Storage and Network Configurations: Implement storage and network configuration alternatives.
• Additional Network Resources: Configure additional VCS network resources.
• Additional Network Design Requirements: Describe additional network design requirements for Solaris.
• Example MultiNIC Setup: Describe an example MultiNIC setup in VCS.


Alternative Storage and Network Configurations
VCS provides the following bundled resource types as an alternative to using VERITAS Volume Manager for storage:
• Solaris: Disk and DiskReservation resource types and agents
• AIX: LVMVolumeGroup resource type and agent
• HP-UX: LVMVolumeGroup, LVMLogicalVolume, or LVMCombo resource types and agents
• Linux: DiskReservation resource type and agent

Before placing the corresponding storage resource under VCS control, you need to prepare the storage component as follows:
1 Create the physical resource on one system.
2 Verify the functionality on the first system.
3 Stop the resource on the first system.
4 Migrate the resource to the next system in the cluster.
5 Verify functionality on the next system.
6 Stop the resource.
7 Repeat steps 4-6 until all the systems in the cluster are tested.

The following pages describe the resource types that you can use on each platform in detail.

Alternative Storage Configurations
Bundled resource types for raw disk or third-party volume management software supported by VCS:
• Solaris: Disk, DiskReservation
• AIX: LVMVolumeGroup
• HP-UX: LVMVolumeGroup, LVMLogicalVolume, or LVMCombo
• Linux: DiskReservation
Slide flowchart (Solaris, AIX, HP-UX, Linux): Create a physical resource on one system and verify accessibility on that system; then, for each remaining system in the cluster, verify accessibility until all systems have been tested.


Solaris

The Disk Resource and Agent on Solaris
The Disk agent monitors a disk partition. Because disks are persistent resources, the Disk agent does not bring disk resources online or take them offline.

Agent Functions
• Online: None
• Offline: None
• Monitor: Determines if the disk is accessible by attempting to read data from the specified UNIX device

Required Attributes

Partition: UNIX partition device name

Note: Specify the Partition attribute as a full path beginning with a slash (/); otherwise, the given name is assumed to reside in /dev/rdsk.

There are no optional attributes for this resource type.

Configuration Prerequisites

You must create the disk partition in UNIX using the format command.

Sample Configuration
Disk myNFSDisk (
    Partition = c1t0d0s0
    )

The DiskReservation Resource and Agent on Solaris
The DiskReservation agent puts a SCSI-II reservation on the specified disks.

Agent Functions
• Online: Brings the resource online after reserving the specified disks
• Offline: Releases the reservation
• Monitor: Checks the accessibility and reservation status of the specified disks

Required Attributes

Disks: The list of raw disk devices specified with absolute or relative path names

Optional Attributes

FailFast, ConfigPercentage, ProbeInterval

Configuration Prerequisites
• Verify that the device path to the disk is recognized by all systems sharing the disk.


• Do not use disks configured as resources of type DiskReservation for disk heartbeats.

• Disable the Reset SCSI Bus at IC Initialization option from the SCSI Select utility.

Sample Configuration
DiskReservation DR (
    Disks = { c0t2d0s2, c1t2d0s2, c2t2d0s2 }
    FailFast = 1
    ConfigPercentage = 80
    ProbeInterval = 6
    )

AIX

The LVMVolumeGroup Agent on AIX

Agent Functions
• Online: Activates the LVM volume group
• Offline: Deactivates the LVM volume group
• Monitor: Checks if the volume group is available using the vgdisplay command
• Clean: Terminates ongoing actions associated with a resource (perhaps forcibly)

Required Attributes
• Disks: The list of disks underneath the volume group
• MajorNumber: The integer that represents the major number of the volume group
• VolumeGroup: The name of the LVM volume group

Optional Attributes

ImportvgOpt, VaryonvgOpt, and SyncODM

Configuration Prerequisites
• The volume group and all of its logical volumes should already be configured.
• The volume group should be imported but not activated on all systems in the cluster.

Sample Configuration
system sysA
system sysB

group lvmgroup (
    SystemList = { sysA, sysB }
    AutoStartList = { sysA }
    )

    LVMVG lvmvg_vg1 (
        VolumeGroup = vg1
        MajorNumber = 50
        Disks = { hdisk22, hdisk23, hdisk45 }
        )

    LVMVG lvmvg_vg2 (
        VolumeGroup = vg2
        MajorNumber = 51
        Disks@sysA = { hdisk37, hdisk38, hdisk39 }
        Disks@sysB = { hdisk61, hdisk62, hdisk63 }
        ImportvgOpt = "f"
        )

HP-UX

LVM Setup on HP-UX
On all systems in the cluster:
• The volume groups and volumes that are on the shared disk array are controlled by the HA software. Therefore, you need to prevent each system from activating these volumes automatically during bootup. To do this, edit the /etc/lvmrc file:
   – Set AUTO_VG_ACTIVATE to 0.
   – Verify that there is a line in the custom_vg_activation() function of the /etc/lvmrc file that activates the vg00 volume group. Add lines to the custom_vg_activation() function to start volume groups that are not part of the HA environment:
     /sbin/vgchange -a y /dev/vgaa
• Each system should have the device nodes for the volume groups on shared devices. Create a device for the volume groups:
   mkdir /dev/vgnn
   mknod /dev/vgnn/group c 64 0x0m0000
   The same minor number (m) has to be used for NFS. By default, this value must be in the range of 1-9.
• Do not create entries in /etc/fstab or /etc/exports for the mount points that will be part of the HA environment. The file systems in the HA environment are mounted and shared by VCS. Therefore, the system should not mount or share these file systems during system boot.


On one of the systems in the cluster:
• Configure volume groups, logical volumes, and file systems.
• Deactivate the volume groups:
   vgexport -p -s -m /tmp/mapfile /dev/vgnn
   rcp /tmp/mapfile othersystems:/tmp/mapfile

On each system in the cluster:
• Import and activate the volume groups:
   vgimport -s -m /tmp/mapfile /dev/vgnn
   vgchange -a y /dev/vgnn
• Create mount points and test.
• Deactivate the volume groups.

Note: Create the volume groups, volumes, and file systems on the shared disk array on only one of the systems in the cluster. However, you need to verify that they can be manually moved from one system to the other by exporting and importing the volume groups on the other systems. Note that you need to create the volume group directory and the group file on each system before importing the volume group. At the end of the verification, ensure that the volume groups on the shared storage array are deactivated on all the systems in the cluster.

There are three resource types that can be used to manage LVM volume groups and logical volumes: LVMVolumeGroup, LVMLogicalVolume, and LVMCombo.

The LVMVolumeGroup Resource and Agent on HP-UX

Agent Functions
• Online: Activates the LVM volume group
• Offline: Deactivates the LVM volume group
• Monitor: Checks if the volume group is available using the vgdisplay command

Required Attributes

VolumeGroup: The name of the LVM volume group

There are no optional attributes for this resource type.

Configuration Prerequisites
• The volume group and all of its logical volumes should already be configured.
• The volume group should be imported but not activated on all systems in the cluster.

Sample Configuration
LVMVolumeGroup MyNFSVolumeGroup (
    VolumeGroup = vg01
    )


LVMLogicalVolume Resource and Agent on HP-UX

Agent Functions
• Online: Activates the LVM logical volume
• Offline: Deactivates the LVM logical volume
• Monitor: Determines if the logical volume is available by performing a read I/O on the raw logical volume

Required Attributes
• LogicalVolume: The name of the LVM logical volume
• VolumeGroup: The name of the LVM volume group

There are no optional attributes for this resource type.

Configuration Prerequisites
• Configure the LVM volume group and the logical volume.
• Configure the VCS LVMVolumeGroup resource on which this logical volume depends.

Sample Configuration
LVMLogicalVolume MyNFSLVolume (
    LogicalVolume = lvol1
    VolumeGroup = vg01
    )

LVMCombo Resource and Agent on HP-UX

Agent Functions
• Online: Activates the LVM volume group and its volumes
• Offline: Deactivates the LVM volume group
• Monitor: Checks if the volume group and all of its logical volumes are available

Required Attributes
• VolumeGroup: The name of the LVM volume group
• LogicalVolumes: The list of logical volumes

There are no optional attributes for this resource type.

Configuration Prerequisites
• The volume group and its volumes should be configured.
• The volume group should be imported but not activated on all systems in the cluster.


Sample Configuration
LVMCombo MyNFSVolumeGroup (
    VolumeGroup = vg01
    LogicalVolumes = { lvol1, lvol2 }
    )

Linux

The DiskReservation Resource and Agent on Linux
The DiskReservation agent puts a SCSI-II reservation on the specified disks.

Agent Functions
• Online: Brings the resource online after reserving the specified disks
• Offline: Releases the reservation
• Monitor: Checks the accessibility and reservation status of the specified disks

Required Attributes

Disks: The list of raw disk devices specified with absolute or relative path names

Optional Attributes

FailFast, ConfigPercentage, ProbeInterval

Configuration Prerequisites
• Verify that the device path to the disk is recognized by all systems sharing the disk.
• Do not use disks configured as resources of type DiskReservation for disk heartbeats.
• Disable the Reset SCSI Bus at IC Initialization option from the SCSI Select utility.

Sample Configuration
DiskReservation diskres1 (
    Disks = { "/dev/sdc" }
    FailFast = 1
    )


Alternative Network Configurations

Local Network Interface Failover

In a client-server environment using TCP/IP, applications often connect to cluster resources using an IP address. VCS provides IP and NIC resources to manage an IP address and network interface.

With this type of high availability network design, a problem with the network or IP address causes service groups to fail over to other systems. This means that the applications and all required resources are taken offline on the system where the fault occurred and are then brought online on another system. If no other systems are available for failover, users experience service downtime until the problem with the network connection or IP address is corrected.

With the availability of inexpensive network adapters, it is common to have many network interfaces on each system. By allocating more than one network interface to a service group, you can potentially avoid failover of the entire service group if the interface fails. By moving the IP address on the failed interface to another interface on the local system, you can minimize downtime.

VCS provides this type of local failover with the MultiNICA and IPMultiNIC resources. On the Solaris and AIX platforms, there are alternative resource types called MultiNICB and IPMultiNICB with additional features that can be used to address the same design requirement. Both resource types are discussed in detail later in this section.

Local Network Interface Failover
You can configure VCS to fail application IP addresses over to a local network interface before failing over to another system.
Slide diagram: Two systems, S1 and S2, each have multiple network ports; a MultiNICA resource (or MultiNICB, Solaris- and AIX-only) moves the application IP address (10.10.198.2 in the diagram) among the local interfaces.


Advantages of Local Interface Failover

Local interface failover can drastically reduce service interruptions to the clients. Some applications have time-consuming shutdown and startup processes that result in substantial downtime when the application fails over from one system to another.

Failover between local interfaces can be completely transparent to users for some applications.

Using multiple networks also makes it possible to prevent switch or hub failures from causing service group failover, as long as the interfaces on each system are connected to separate hubs or switches.


Network Resources Overview
The MultiNICA agent is capable of monitoring multiple network interfaces, and if one of these interfaces faults, VCS fails over the IP address defined by the IPMultiNIC resource to the next available public network adapter.

The IPMultiNIC and MultiNICA resources provide essentially the same service as the IP and NIC resources, but monitor multiple interfaces instead of a single interface. The dependency between these resources is the same as the dependency between the IP and NIC resources.

On the Solaris platform, the MultiNICB and IPMultiNICB agents provide the same functionality as the MultiNICA and IPMultiNIC agents with many additional features, such as:
• Support for the Solaris IP multipathing daemon
• Support for trunked network interfaces on Solaris
• Support for faster failover
• Support for active/active interfaces
• Support for manual failback

With the MultiNICB agent, the logical IP addresses are failed back when the original physical interface comes up after a failure.

Note: This lesson provides detailed information about MultiNICB and IPMultiNICB on Solaris only. For AIX-specific information, see the VERITAS Cluster Server for AIX Bundled Agents Reference Guide.

Network Resources Overview
The IP and NIC relationship correlates to the IPMultiNIC and MultiNICA relationship, or to the IPMultiNICB and MultiNICB relationship (Solaris and AIX only). IP, IPMultiNIC, and IPMultiNICB manage virtual IP addresses; NIC manages a single interface, while MultiNICA and MultiNICB manage multiple interfaces.


Additional Network Resources
The MultiNICA Resource and Agent
The MultiNICA agent monitors the specified network interfaces and moves the administrative IP address among them in the event of a failure. The agent functions and the required attributes for the MultiNICA resource type are listed on the slide.

Key Points
• The MultiNICA resource is marked online if the agent can ping at least one host in the list provided by NetworkHosts. If NetworkHosts is not specified, the monitor function broadcasts to the subnet of the administrative IP address on the interface. The monitor function counts the number of packets passing through the device before and after the address is pinged; if the count decreases or remains the same, the resource is marked as offline.
• Do not use other systems in the cluster as part of NetworkHosts. NetworkHosts normally contains devices that are always available on the network, such as routers, hubs, or switches.
• When configuring the NetworkHosts attribute, it is recommended that you use IP addresses rather than host names to remove the dependency on DNS.

The MultiNICA Resource and Agent
Agent functions:
• Online: None
• Offline: None
• Monitor: Uses ping to connect to hosts in NetworkHosts. If NetworkHosts is not specified, it broadcasts to the network address.
Required attributes:
• Device: The list of network interfaces and a unique administrative IP address for each system that is assigned to the active device
• NetworkHosts: The list of IP addresses on the network that are pinged to test the network connection
• NetMask: The network mask for the base IP address
Note: The slide marks some of these attributes as required only on certain platforms ("Required for AIX, Linux" and "Required for HP-UX"); see the VERITAS Cluster Server Bundled Agents Reference Guide for the per-platform requirements.


Optional Attributes

Following is a list of the optional attributes of the MultiNICA resource type for the supported platforms:
• HandshakeInterval (not used on Linux): Used to compute the number of times that the monitor pings after migrating to a new NIC. The value should be set to a multiple of 10. The default value is 90. Note: This attribute determines how long it takes to detect a failed interface and therefore affects failover time. The value must be greater than 50; otherwise, the value is ignored, and the default of 90 is used.
• Options: The options used with ifconfig to configure the administrative IP address.
• RouteOptions: The string used to add a route when configuring an interface. This string contains three values: destination gateway metric. No routes are added if this string is set to NULL.
• PingOptimize (not used on HP-UX): The number of monitor cycles used to detect whether the configured interface is inactive. A value of 1 optimizes broadcast pings and requires two monitor cycles. A value of 0 performs a broadcast ping during each monitor cycle and detects the inactive interface within the cycle. The default is 1.
• IfconfigTwice (Solaris- and HP-UX-only): If set to 1, this attribute causes an IP address to be configured twice, using an ifconfig up-down-up sequence, and increases the probability of gratuitous ARPs (caused by ifconfig up) reaching clients. The default is 0.
• ArpDelay (Solaris- and HP-UX-only): The number of seconds to sleep between configuring an interface and sending out a broadcast to inform the routers of the administrative IP address. The default is 1 second.
• RetestInterval (Solaris-only): The number of seconds to sleep between retests of a newly configured interface. The default is 5. Note: A lower value results in faster local (interface-to-interface) failover.
• BroadcastAddr (AIX-only): The broadcast address for the base IP address on the interface. Note: This attribute is required on AIX if the agent has to use the broadcast address for the interface.
• Domain (AIX-only): The domain name. Note: This attribute is required on AIX if a domain name is used.
• Gateway (AIX-only): The IP address of the default gateway. Note: This attribute is required on AIX if a default gateway is used.
• NameServerAddr (AIX-only): The IP address of the name server. Note: This attribute is required on AIX if a name server is used.


• FailoverInterval (Linux-only): The interval, in seconds, to wait while checking whether the NIC is active during failover. During this interval, ping requests are sent out to determine if the NIC is active. If the NIC is not active, the next NIC in the Device list is tested. The default is 60 seconds.
• FailoverPingCount (Linux-only): The number of times to send ping requests during the FailoverInterval. The default is 4.
• AgentDebug (Linux-only): If set to 1, this flag causes the agent to log additional debug messages. The default is 0.


MultiNICA Resource Configuration
The slide displays how you need to prepare the physical resource before you put it under VCS control using the MultiNICA resource type.

The resource type definition in the types.cf file displays the default values for MultiNICA optional attributes. Refer to the VERITAS Cluster Server Bundled Agents Reference Guide for more information on the MultiNICA resource type.

Here are some sample configurations for the MultiNICA resource on various platforms:

Solaris

MultiNICA mnic_sol (
    Device@S1 = { le0 = "10.128.8.42", qfe3 = "10.128.8.42" }
    Device@S2 = { le0 = "10.128.8.43", qfe3 = "10.128.8.43" }
    NetMask = "255.255.255.0"
    ArpDelay = 5
    Options = "trailers"
    )

MultiNICA Resource Configuration
Configuration prerequisites:
- NICs on the same system must be on the same network segment.
- Configure an administrative IP address for one of the network interfaces on each system.
AIX sample configuration:
MultiNICA mnic (
    Device@S1 = { en3 = "10.128.8.42", en4 = "10.128.8.42" }
    Device@S2 = { en3 = "10.128.8.43", en4 = "10.128.8.43" }
    NetMask = "255.255.255.0"
    NameServerAddr = "10.130.8.1"
    ...
    )


AIX

MultiNICA mnic_aix (
    Device@S1 = { en0 = "10.128.8.42", en3 = "10.128.8.42" }
    Device@S2 = { en0 = "10.128.8.43", en3 = "10.128.8.43" }
    NetMask = "255.255.255.0"
    NameServerAddr = "10.128.1.100"
    Gateway = "10.128.8.1"
    Domain = "veritas.com"
    BroadcastAddr = "10.128.8.255"
    Options = "mtu m"
    )

HP-UX

MultiNICA mnic_hp (
    Device@S1 = { lan0 = "10.128.8.42", lan3 = "10.128.8.42" }
    Device@S2 = { lan0 = "10.128.8.43", lan3 = "10.128.8.43" }
    NetMask = "255.255.255.0"
    Options = "arp"
    RouteOptions@S1 = "default 10.128.8.42 0"
    RouteOptions@S2 = "default 10.128.8.43 0"
    NetworkHosts = { "10.128.8.44", "10.128.8.50" }
    )

Linux

MultiNICA mnic_lnx (
    Device@S1 = { eth0 = "10.128.8.42", eth1 = "10.128.8.42" }
    Device@S2 = { eth0 = "10.128.8.43", eth2 = "10.128.8.43" }
    NetMask = "255.255.250.0"
    NetworkHosts = { "10.128.8.44", "10.128.8.50" }
    )


Configuring Local Attributes

MultiNICA is configured similarly to any other resource using hares commands. However, you need to specify different IP addresses for the Device attribute so that each system has a unique administrative IP address for the local network interface.

An attribute whose value applies to all systems is global in scope. An attribute whose value applies on a per-system basis is local in scope. By default, all attributes are global. Some attributes can be localized to enable you to specify different values for different systems. These specifications are required when configuring MultiNICA to specify unique administrative IP addresses for each system.

Localizing the attribute means that each system in the service group’s SystemList has a value assigned to it. The value is initially set the same for each system—the value that was configured before the localization. After an attribute is localized, you can modify the values to be unique for different systems.

Localizing MultiNICA Attributes
Localize the Device attribute to set a unique administrative IP address for each system:

hares -local mnic Device
hares -modify mnic Device en0 10.128.8.42 -sys S1

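As a hedged sketch of the complete sequence for a two-system cluster (assuming system names S1 and S2 and the en0 address values shown on the slide), the localized Device values might be set as follows; additional interface and base IP pairs would be added to each system's value in the same way.

haconf -makerw
hares -local mnic Device
hares -modify mnic Device en0 10.128.8.42 -sys S1
hares -modify mnic Device en0 10.128.8.43 -sys S2
haconf -dump -makero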


MultiNICA Failover

The diagram in the slide gives a conceptual view of how, if one of the interfaces faults, the agent fails over the administrative IP address on that physical interface to another physical interface under its control.

Local MultiNICA Failover

The MultiNICA agent:
1. Sends a ping to the subnet broadcast address (or to NetworkHosts, if specified):
   ping 10.128.8.255
2. Compares packet counts and detects a fault:
   Request timed out. Ping statistics for 10.128.8.255: Packets: Sent = 4, Received = 0
3. Configures the administrative IP address on the next interface in the Device attribute:
   ifconfig en3 inet 10.128.8.42
   ifconfig en3 up


The IPMultiNIC Resource and Agent

The IPMultiNIC agent monitors the virtual (logical) IP address configured as an alias on one interface of a MultiNICA resource. If the interface faults, the agent works with the MultiNICA resource to fail over to a backup interface. If multiple service groups have IPMultiNIC resources associated with the same MultiNICA resource, only one group has the MultiNICA resource. The other groups have Proxy resources pointing to it.

The agent functions and the required attributes for the IPMultiNIC resource type are listed on the slide.

Note: It is recommended to set the RestartLimit attribute of the IPMultiNIC resource to a nonzero value to prevent spurious resource faults during a local failover of the MultiNICA resource.
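One way to apply this recommendation from the command line is sketched below; it assumes an IPMultiNIC resource named ip1 and uses hares -override to make the static RestartLimit attribute editable for that resource only. Adjust the value to suit your environment.

haconf -makerw
hares -override ip1 RestartLimit
hares -modify ip1 RestartLimit 2
haconf -dump -makero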


Agent functions:
• Online: Configures an IP alias (known as the virtual or application IP address) on an active network device in the specified MultiNICA resource
• Offline: Removes the IP alias
• Monitor: Determines whether the IP address is up on one of the interfaces used by the MultiNICA resource

Required attributes:
• MultiNICResName: The name of the MultiNICA resource for this virtual IP address (called MultiNICAResName on AIX and Linux)
• Address: The IP address assigned to the MultiNICA resource, used by network clients
• NetMask: The netmask for the virtual IP address


Optional Attributes

Following is a list of optional attributes of the IPMultiNIC resource type for the supported platforms:
• Options: Options used with ifconfig to configure the virtual IP address
• IfconfigTwice (Solaris- and HP-UX-only): If set to 1, this attribute causes an IP address to be configured twice, using an ifconfig up-down-up sequence, and increases the probability of gratuitous arps (caused by ifconfig up) reaching clients. The default is 0.

IPMultiNIC Resource Configuration

The IPMultiNIC resource requires a MultiNICA resource to determine the interface on which it should configure the virtual IP address.

Note: Do not configure the virtual service group IP address at the operating system level. The IPMultiNIC agent must be able to configure this address.

Optional attributes: Options, IfconfigTwice (Solaris- and HP-UX-only)

Configuration prerequisites: The MultiNICA agent must be running to inform the IPMultiNIC agent of the available interfaces.

AIX sample configuration:
IPMultiNIC ip1 (
    Address = "10.128.10.14"
    NetMask = "255.255.255.0"
    MultiNICAResName = mnic
    )

MultiNICA mnic (
    Device@S1 = { en0 = "10.128.8.42", en3 = "10.128.8.42" }
    Device@S2 = { en0 = "10.128.8.43", en3 = "10.128.8.43" }
    NetMask = "255.255.255.0"
    )
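The equivalent resources can also be built with hares commands. The following is a sketch only; it assumes a service group named appSG, systems S1 and S2, and the AIX names and addresses from the sample above.

haconf -makerw
hares -add mnic MultiNICA appSG
hares -local mnic Device
hares -modify mnic Device en0 10.128.8.42 -sys S1
hares -modify mnic Device en0 10.128.8.43 -sys S2
hares -modify mnic NetMask 255.255.255.0
hares -add ip1 IPMultiNIC appSG
hares -modify ip1 Address 10.128.10.14
hares -modify ip1 NetMask 255.255.255.0
hares -modify ip1 MultiNICAResName mnic
hares -link ip1 mnic            # ip1 requires mnic
hares -modify mnic Enabled 1
hares -modify ip1 Enabled 1
haconf -dump -makero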


Following are some sample configurations for the IPMultiNIC resource on the supported platforms:

Solaris

MultiNICA mnic_sol (
    Device@S1 = { le0 = "10.128.8.42", qfe3 = "10.128.8.42" }
    Device@S2 = { le0 = "10.128.8.43", qfe3 = "10.128.8.43" }
    NetMask = "255.255.255.0"
    ArpDelay = 5
    Options = "trailers"
    )

IPMultiNIC ip_sol (
    Address = "10.128.10.14"
    NetMask = "255.255.255.0"
    MultiNICResName = mnic_sol
    Options = "trailers"
    )

ip_sol requires mnic_sol

AIX

MultiNICA mnic_aix (
    Device@S1 = { en0 = "10.128.8.42", en3 = "10.128.8.42" }
    Device@S2 = { en0 = "10.128.8.43", en3 = "10.128.8.43" }
    NetMask = "255.255.255.0"
    NameServerAddr = "10.128.1.100"
    Gateway = "10.128.8.1"
    Domain = "veritas.com"
    BroadcastAddr = "10.128.8.255"
    Options = "mtu m"
    )

IPMultiNIC ip_aix (
    Address = "10.128.10.14"
    NetMask = "255.255.255.0"
    MultiNICAResName = mnic_aix
    Options = "mtu m"
    )

ip_aix requires mnic_aix


HP-UX

MultiNICA mnic_hp (
    Device@S1 = { lan0 = "10.128.8.42", lan3 = "10.128.8.42" }
    Device@S2 = { lan0 = "10.128.8.43", lan3 = "10.128.8.43" }
    NetMask = "255.255.255.0"
    Options = "arp"
    RouteOptions@S1 = "default 10.128.8.42 0"
    RouteOptions@S2 = "default 10.128.8.43 0"
    NetworkHosts = { "10.128.8.44", "10.128.8.50" }
    )

IPMultiNIC ip_hp (
    Address = "10.128.10.14"
    NetMask = "255.255.255.0"
    MultiNICResName = mnic_hp
    Options = "arp"
    )

ip_hp requires mnic_hp

Linux

MultiNICA mnic_lnx (
    Device@S1 = { eth0 = "10.128.8.42", eth1 = "10.128.8.42" }
    Device@S2 = { eth0 = "10.128.8.43", eth2 = "10.128.8.43" }
    NetMask = "255.255.250.0"
    NetworkHosts = { "10.128.8.44", "10.128.8.50" }
    )

IPMultiNIC ip_lnx (
    Address = "10.128.10.14"
    MultiNICAResName = mnic_lnx
    NetMask = "255.255.250.0"
    )

ip_lnx requires mnic_lnx


IPMultiNIC Failover

The diagram gives a conceptual view of what happens when all network interfaces that are part of the MultiNICA configuration fault. In this example, en0 fails first, and the MultiNICA agent brings up the administrative IP address on en3. Then en3 fails, and the MultiNICA resource faults. The service group containing the MultiNICA and IPMultiNIC resources faults on the first system and fails over to the other system.

On the second system, the MultiNICA resource is brought online first, and the agent brings up a unique administrative IP address on en0. Next, the IPMultiNIC resource is brought online, and the agent brings up the virtual IP address on en0.

1. IPMultiNIC brings up the virtual IP address on S1:
   ifconfig en0 inet 10.10.23.45 alias
2. en0 fails, and the MultiNICA agent moves the administrative IP address to en3:
   ifconfig en3 inet 10.128.8.42
   ifconfig en3 up
3. en3 fails. The service group with MultiNICA and IPMultiNIC fails over to S2.
4. MultiNICA comes online on S2 and brings up the administrative IP address; IPMultiNIC comes online next and brings up the virtual IP address:
   ifconfig en0 inet 10.128.8.43
   ifconfig en0 up
   ifconfig en0 inet 10.10.23.45 alias


Additional Network Design Requirements

MultiNICB and IPMultiNICB
These additional agents are supported on VCS versions for Solaris and AIX. Solaris support is described in detail in the lesson. For AIX configuration information, see the VERITAS Cluster Server 4.0 for AIX Bundled Agents Reference Guide.

Solaris-Specific Capabilities

Solaris provides an IP multipathing daemon (mpathd) that can be used to provide local interface failover for network resources at the OS level. IP multipathing also balances outbound traffic between working interfaces.

Solaris also has the capability to use several network interfaces as a single connection that has a bandwidth equal to the sum of individual interfaces. This capability is known as trunking. Trunking is an add-on feature that balances both inbound and outbound traffic.

Both of these features can be used to provide the redundancy of multiple network interfaces for a specific application IP. The MultiNICA and IPMultiNIC resources do not support these features. VERITAS provides MultiNICB and IPMultiNICB resource types for use with multipathing or trunking on Solaris only.

On Solaris, these agents support:
• The IP multipathing daemon
• Trunked network interfaces
• Local interface failover times of less than 30 seconds

For AIX-specific support of MultiNICB and IPMultiNICB, see the VERITAS Cluster Server for AIX Bundled Agents Reference Guide.


How the MultiNICB Agent Operates

The MultiNICB agent monitors the specified interfaces differently, depending on whether the resource is configured in base or multipathing (mpathd) mode.

In base mode, you can configure one monitoring method or a combination of methods. The agent can:
• Use system calls to query the interface device driver and check the link status. Using system calls is the fastest way to check interfaces, but this method only detects failures caused by cable disconnections.
• Send ICMP packets to a network host. You can configure the MultiNICB resource to have the agent check status by sending ICMP pings to determine if the interfaces are working. You can use this method in conjunction with link status checking.
• Send an ICMP broadcast and use the first responding IP address as the network host for future ICMP echo requests.

Note: AIX supports only base mode for MultiNICB.

On Solaris 8 and later, you can configure MultiNICB to work with the IP multipathing daemon. In this situation, MultiNICB functionality is limited to monitoring the FAILED flag on physical interfaces and monitoring mpathd.

In both cases, MultiNICB writes the status of each interface to an export information file, which can be read by other agents (such as IPMultiNICB) or commands (such as haipswitch).

MultiNICB Modes

The MultiNICB agent monitors interfaces using different methods based on whether Solaris IP multipathing is used.
• Base mode:
  – Uses system calls to query the interface device driver
  – Sends ICMP echo request packets to a network host
  – Broadcasts an ICMP echo and uses the first reply as a network host
• mpathd mode:
  – Checks the multipathing daemon (in.mpathd) for the FAILED flag
  – Monitors the in.mpathd daemon

Only base mode is supported on AIX.


MultiNICB Failover

If one of the physical interfaces under MultiNICB control goes down, the agent fails over the logical IP addresses on that physical interface to another physical interface under its control.

When the MultiNICB resource is set to multipathing (mpathd) mode, the agent writes the status of each interface to an internal export information structure and takes no other action when a failed status is returned from the mpathd daemon. The multipathing daemon migrates the logical IP addresses.

If a MultiNICB interface fails, the agent:
• In base mode:
  – Fails over all logical IP addresses configured on that interface to another physical interface under its control
  – Writes the status to an internal export information structure that is read by IPMultiNICB
• In mpathd mode:
  – Writes the failed status from the mpathd daemon to the export structure
  – Takes no other action; mpathd migrates logical IP addresses


The MultiNICB Resource and Agent

The agent functions and the required attributes for the MultiNICB resource type are listed on the slide.

Key Points

These are the key points of MultiNICB operation:
• Monitor functionality depends on the operating mode of the MultiNICB agent.
• In both modes, the interface status information is written to a file.
• After a failover, if the original interface becomes operational again, the virtual IP addresses are failed back.
• When a MultiNICB resource is enabled, the agent expects all physical interfaces under the resource to be plumbed and configured with the test IP addresses by the OS.

MultiNICB has only one required attribute: Device. This attribute specifies the list of interfaces, and optionally their aliases, that are controlled by the resource. An example configuration is shown in a later section.

Agent functions:
• Open: Allocates an internal structure for resource information
• Close: Frees the internal structure for resource information
• Monitor: Checks the status using one or more of the configured methods, writes interface status information to the internal structure that is read by IPMultiNICB, and fails over (and back) logical (virtual) IP addresses among configured interfaces

Required attributes:
• Device: The list of network interfaces, and optionally their aliases, that can be used by IPMultiNICB


MultiNICB Optional Attributes

Two optional attributes are used to set the mode:
• MpathdCommand: The path to the mpathd executable that stops or restarts mpathd. The default is /sbin/in.mpathd.
• UseMpathd: When this attribute is set to 1, MultiNICB restarts mpathd if it is not running already. This setting is allowed only on Solaris 8, 9, or 10 systems. If this attribute is set to 0, in.mpathd is stopped. All MultiNICB resources on the same system must have the same value for this attribute. The default is 0.

mpathd Mode Optional Attributes
• ConfigCheck: If set to 1, MultiNICB checks the interface configuration. The default is 1.
• MpathdRestart: If set to 1, MultiNICB attempts to restart mpathd. The default is 1.

Setting the mode:
• UseMpathd: Starts or stops mpathd (1, 0); when set to 0, base mode is specified. (Setting UseMpathd to 1 is allowed on Solaris 8, 9, and 10 only.)
• MpathdCommand: Sets the path to the mpathd executable.

mpathd mode:
• ConfigCheck: When set, the agent makes these checks:
  – All interfaces are in the same subnet and service group.
  – No other interfaces are on this subnet.
  – The nofailover and deprecated flags are set on test IP addresses.
• MpathdRestart: Attempts to restart mpathd.


Base Mode Optional Attributes
• Failback: If set to 1, MultiNICB fails virtual IP addresses back to the original physical interfaces, if possible. The default is 0.
• IgnoreLinkStatus: When this attribute is set to 1, driver-reported status is ignored. This attribute must be set when using trunked interfaces. The default is 1.
• LinkTestRatio: Determines in which monitor cycles packets are sent to test the interfaces and in which cycles only the driver-reported link status is checked. For example, when this attribute is set to 3 (default), the agent sends a packet to test the interface every third monitor cycle. At all other monitor cycles, the link is tested by checking the link status reported by the device driver.
• NoBroadcast: Prevents the agent from broadcasting. The default is 0—broadcasts are allowed.
• DefaultRouter: Adds the specified default route when the resource is brought online and removes the default route when the resource is taken offline. The default is 0.0.0.0.
• NetworkHosts: The IP addresses used to monitor the interfaces. These addresses must be directly accessible on the LAN. The default is null.
• NetworkTimeout: The amount of time that the agent waits for responses from network hosts. The default is 100 milliseconds.

Key base mode optional attributes:
• Failback: Fails virtual IP addresses back to the original physical interfaces, if possible
• IgnoreLinkStatus: Ignores driver-reported status—must be set when using trunked interfaces
• NetworkHosts: The list of IP addresses directly accessible on the LAN used to monitor the interfaces
• NoBroadcast: Useful if ICMP ping is disallowed for security, for example

See the VERITAS Cluster Server Bundled Agents Reference Guide for a complete description of all optional attributes.


• OnlineTestRepeatCount, OfflineTestRepeatCount: The number of times an interface is tested if the status changes. For every repetition of the test, the next system in NetworkHosts is selected in a round-robin manner. A greater value prevents spurious changes, but it also increases the response time. The default is 3.

The resource type definition in the types.cf file displays the default values for MultiNICB attributes:

type MultiNICB (
    static int MonitorInterval = 10
    static int OfflineMonitorInterval = 60
    static int MonitorTimeout = 60
    static int Operations = None
    static str ArgList[] = { UseMpathd, MpathdCommand, ConfigCheck, MpathdRestart, Device, NetworkHosts, LinkTestRatio, IgnoreLinkStatus, NetworkTimeout, OnlineTestRepeatCount, OfflineTestRepeatCount, NoBroadcast, DefaultRouter, Failback }
    int UseMpathd = 0
    str MpathdCommand = "/sbin/in.mpathd"
    int ConfigCheck = 1
    int MpathdRestart = 1
    str Device{}
    str NetworkHosts[]
    int LinkTestRatio = 1
    int IgnoreLinkStatus = 1
    int NetworkTimeout = 100
    int OnlineTestRepeatCount = 3
    int OfflineTestRepeatCount = 3
    int NoBroadcast = 0
    str DefaultRouter = "0.0.0.0"
    int Failback = 0
)
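As a hedged example of adjusting base-mode behavior from the command line (assuming a MultiNICB resource named webSGMNICB, the name used in the samples later in this lesson):

haconf -makerw
hares -modify webSGMNICB NetworkHosts 10.10.1.1 10.10.2.2   # ping these hosts instead of broadcasting
hares -modify webSGMNICB Failback 1                         # fail virtual IPs back when the original interface recovers
haconf -dump -makero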


MultiNICB Configuration Prerequisites

You must ensure that all the requirements are met for the MultiNICB agent to function properly. In addition to the general requirements listed in the slide, check these operating system-specific requirements:
• For Solaris 6 and Solaris 7, disable IP interface groups by using the command:
  ndd -set /dev/ip ip_enable_group_ifs 0
• For Solaris 8 and later:
  – Use Solaris 8 release 10/00 or later.
  – To use MultiNICB with multipathing:
    › Read the IP Network Multipathing Administration Guide from Sun.
    › Set the nofailover and deprecated flags for the test IP addresses at boot time.
    › Verify that the /etc/default/mpathd file includes the line:
      TRACK_INTERFACES_ONLY_WITH_GROUPS=yes

Configuration prerequisites:
• A unique MAC address is required for each interface.
• Interfaces are plumbed and configured with a test IP address at boot time.
• Test IP addresses must be on a single subnet, which must be used only for the MultiNICB resource.
• If using multipathing (Solaris 8 and later only):
  – Set UseMpathd to 1.
  – Set /etc/default/mpathd: TRACK_INTERFACES_ONLY_WITH_GROUPS=yes


Sample Interface Configuration

Before configuring MultiNICB:
• Ensure that each interface has a unique MAC address.
• Modify or create the /etc/hostname.interface files for each interface to ensure that the interfaces are plumbed and given IP addresses during boot. For Solaris 8 and later, set the deprecated and nofailover flags. In the example given on the slide, S1-qfe3 and S1-qfe4 are the host names corresponding to the test IP addresses assigned to the qfe3 and qfe4 interfaces on the S1 system, respectively. The corresponding test IP addresses are shown in the /etc/hosts file.
• Either reboot or manually configure the interfaces.

Note: If you change the local-mac-address? eeprom parameter, you must reboot the systems.

Display and set MAC addresses of all MultiNICB interfaces:
eeprom
eeprom local-mac-address?=true

Configure interfaces on each system (Solaris 8 and later):
/etc/hostname.qfe3:
S1-qfe3 netmask + broadcast + deprecated -failover up
/etc/hostname.qfe4:
S1-qfe4 netmask + broadcast + deprecated -failover up

/etc/hosts (test IP addresses):
10.10.1.3 S1-qfe3
10.10.1.4 S1-qfe4
10.10.2.3 S2-qfe3
10.10.2.4 S2-qfe4

Reboot all systems if you set local-mac-address? to true. Otherwise, you can configure interfaces manually using ifconfig and avoid rebooting.


Sample MultiNICB Configuration

The example shows a MultiNICB configuration with two interfaces specified: qfe3 and qfe4.

The IPMultiNICB agent uses one of these interfaces to configure an IP alias (virtual IP address) when it is brought online. If an interface alias number is specified with the interface, IPMultiNICB selects the interface that corresponds to the number set in its DeviceChoice attribute (described in the “Configuring IPMultiNICB” section).

Example MultiNICB configuration:
hares -modify webSGMNICB Device qfe3 0 qfe4 1

Example main.cf file with interfaces and aliases:
MultiNICB webSGMNICB (
    Device = { qfe3 = 0, qfe4 = 1 }
    NetworkHosts = { "10.10.1.1", "10.10.2.2" }
    )

The number paired with the interface is used by the IPMultiNICB resource to determine which interface to select to bring up the virtual IP address.


The IPMultiNICB Resource and Agent

The IPMultiNICB agent monitors a virtual (logical) IP address configured as an alias on one of the interfaces of a MultiNICB resource. If the physical interface on which the logical IP address is configured is marked DOWN by the MultiNICB agent, or a FAILED flag is set on the interface (for Solaris 8), the resource is reported OFFLINE. If multiple service groups have IPMultiNICB resources associated with the same MultiNICB resource, only one group has the MultiNICB resource. The other groups have a Proxy resource pointing to the MultiNICB resource.

The agent functions and the required attributes for the IPMultiNICB resource type are listed on the slide.

Agent functions:
• Online: Configures an IP alias (known as the virtual or application IP address) on an active network device in the specified MultiNICB resource
• Offline: Removes the IP alias
• Monitor: Determines whether the IP address is up by checking the export information file written by the MultiNICB resource

Required attributes:
• BaseResName: The name of the MultiNICB resource for this virtual IP address
• Address: The virtual IP address assigned to the MultiNICB resource, used by network clients
• NetMask: The netmask for the virtual IP address


Configuring IPMultiNICB

Optional Attributes

The optional attribute, DeviceChoice, indicates the preferred physical interface on which to bring the logical IP address online. Specify the device name or interface alias as listed in the Device attribute of the MultiNICB resource.

This example shows DeviceChoice set to an interface:
DeviceChoice = "qfe3"

In the next example, DeviceChoice is set to an interface alias:
DeviceChoice = "1"

In the second case, the logical IP address is brought online on the qfe4 interface (assuming that the MultiNICB resource specifies qfe4=1).

Using an alias is advantageous when you have large numbers of virtual IP addresses. For example, if you have 50 virtual IP addresses and you want all of them to try qfe4, you can set Device={qfe3=0, qfe4=1} and DeviceChoice=1. If you need to replace the qfe4 interface, you do not need to change DeviceChoice for each of the 50 IPMultiNICB resources. The default for DeviceChoice is 0.

IPMultiNICB oraMNICB (
    BaseResName = webSGMNICB
    Address = "10.10.10.21"
    NetMask = "255.0.0.0"
    DeviceChoice = "1"
    )

Configuration prerequisites:
• The MultiNICB agent must be running to inform the IPMultiNICB agent of the available interfaces.
• Only one VCS IP agent (IPMultiNICB, IPMultiNIC, or IP) can control each logical IP address.

Optional attribute:
• DeviceChoice: The device name or interface alias on which to bring the logical IP address online

MultiNICB webSGMNICB (
    Device = { qfe3 = 0, qfe4 = 1 }
    )

IPMultiNICB webSGIPMNICB (
    BaseResName = webSGMNICB
    Address = "10.10.10.21"
    NetMask = "255.0.0.0"
    DeviceChoice = "1"
    )
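A command-line sketch of adding one of these resources follows; it assumes a service group named webSG and uses the webSGMNICB and webSGIPMNICB names and values from the sample above.

haconf -makerw
hares -add webSGIPMNICB IPMultiNICB webSG
hares -modify webSGIPMNICB BaseResName webSGMNICB
hares -modify webSGIPMNICB Address 10.10.10.21
hares -modify webSGIPMNICB NetMask 255.0.0.0
hares -modify webSGIPMNICB DeviceChoice 1
hares -link webSGIPMNICB webSGMNICB      # webSGIPMNICB requires webSGMNICB
hares -modify webSGIPMNICB Enabled 1
haconf -dump -makero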


Switching Between Interfaces

You can use the haipswitch command to manually migrate the logical IP address from one interface to another when you use the MultiNICB and IPMultiNICB resources.

The syntax is:
haipswitch MultiNICB_resname IPMultiNICB_resname \
    ip_addr netmask from to
haipswitch -s MultiNICB_resname

In the first form, the command performs the following tasks:
1. Checks that both the from and to interfaces are associated with the specified MultiNICB resource and that the to interface is working. If the interface is not working, the command aborts the operation.
2. Removes the IP address on the from logical interface.
3. Configures the IP address on the to logical interface.
4. Erases previous failover information created by MultiNICB for this logical IP address.

In the second form, the command shows the status of the interfaces for the specified MultiNICB resource.

This command is useful for switching back to a fixed interface after a failover. For example, if the IP address is normally on a 1Gb Ethernet interface and it fails over to a 100Mb interface, you can switch it back to the higher bandwidth interface when it is fixed.

You can use the haipswitch command to move the IP addresses:
haipswitch MultiNICB_resname IPMultiNICB_resname \
    ip_addr netmask from_interface to_interface

The command is located in the directory /opt/VRTSvcs/bin/IPMultiNICB.

You can also check the status of the resource using haipswitch in this form:
haipswitch -s MultiNICB_resname
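For example, using the resource names and addresses from the samples in this lesson (a sketch only; substitute your own values), the virtual IP address could be moved from qfe3 to qfe4 and the interface status checked as follows:

cd /opt/VRTSvcs/bin/IPMultiNICB
./haipswitch webSGMNICB webSGIPMNICB 10.10.10.21 255.0.0.0 qfe3 qfe4
./haipswitch -s webSGMNICB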


The MultiNICB Trigger

VCS provides a trigger named multinicb_postchange to notify you when MultiNICB resources change state. This trigger can be used to alert you to problems with network interfaces that are managed by the MultiNICB agent.

When an interface fails, VCS does not fault the MultiNICB resource until there are no longer any working interfaces defined in the Device attribute. Although the log indicates when VCS fails an IP address between interfaces, the ResFault trigger is not run. If you configure multinicb_postchange, you receive active notification of changes occurring in the MultiNICB configuration.

You can configure a trigger to notify you of changes in the state of MultiNICB resources:
• The trigger is invoked at the first monitor cycle and during state transitions.
• The trigger script must be named multinicb_postchange.
• The script must be located in /opt/VRTSvcs/bin/triggers/multinicb.
• A sample script is provided.
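A minimal sketch of putting the trigger in place is shown below. The location of the provided sample script is an assumption here; check your installation for the actual path before copying it.

mkdir -p /opt/VRTSvcs/bin/triggers/multinicb
# Assumed sample location; verify on your systems before copying
cp /opt/VRTSvcs/bin/sample_triggers/multinicb_postchange \
   /opt/VRTSvcs/bin/triggers/multinicb/multinicb_postchange
chmod 755 /opt/VRTSvcs/bin/triggers/multinicb/multinicb_postchange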


Example MultiNIC Setup

Cluster Interconnect
On each system, two interfaces from different network cards are used by LLT for VCS communication. These interfaces may be connected by crossover cables or by means of a network hub or switch for each link.

Base IP Addresses
The network interfaces used for the MultiNICA or MultiNICB resources (ports 3 and 4 on the slide) should be configured with the specified base IP addresses by the operating system during system startup. These base IP addresses are not used by applications. The addresses are used by VCS resources to check the network connectivity. Note that if you use MultiNICA, you need only one base IP address per system. However, if you use MultiNICB, you need one base IP address per interface.

NIC and IP Resources
The network interface shown as port2 is used by an IP and a NIC resource. This interface also has an administrative IP address configured by the operating system during system startup.

MultiNICA and IPMultiNIC, or MultiNICB and IPMultiNICB
The network interfaces shown as port3 and port4 are used by VCS for local interface failover. These interfaces are connected to separate hubs to eliminate single points of failure. The only single point of failure for the MultiNICA or MultiNICB resource is the quad Ethernet card on the system. You can also use interfaces on separate network cards to eliminate this single point of failure.

[Diagram: Example MultiNIC setup. On System1, port0 and port1 carry the heartbeat links; port2 (192.168.27.101) is used by the NIC and IP resources; port3 (10.10.1.3) and port4 (10.10.1.4) connect to Hub 1 and Hub 2 for the MultiNIC resources. System2 mirrors this layout with port2 (192.168.27.102), port3 (10.10.2.3), and port4 (10.10.2.4). The port4 addresses are required for MultiNICB only.]


Comparing MultiNICA and MultiNICB

Advantages of Using MultiNICA and IPMultiNIC
• Physical interfaces can be plumbed as needed by the agent, supporting an active/passive configuration.
• MultiNICA requires only one base IP address for the set of interfaces under its control. This address can also be used as the administrative IP address for the system.
• MultiNICA does not require all interfaces to be part of a single IP subnet.

Advantages of Using MultiNICB and IPMultiNICB
• All interfaces under a particular MultiNICB resource are always configured and have test IP addresses to speed failover.
• MultiNICB failover is many times faster than that of MultiNICA.
• Support for single and multiple interfaces eliminates the need for separate pairs of NIC and IP, or MultiNICA and IPMultiNIC, for these interfaces.
• MultiNICB and IPMultiNICB support failback of IP addresses.
• MultiNICB and IPMultiNICB support manual movement of IP addresses between working interfaces under the same MultiNICB resource without changing the VCS configuration or disabling resources.
• MultiNICB and IPMultiNICB support IP multipathing, interface groups, and trunked ge and qfe interfaces.

MultiNICA and IPMultiNIC:
– Supports active/passive configurations
– Requires only one base IP address
– Does not require a single IP subnet

MultiNICB and IPMultiNICB:
– Requires an IP address for each interface
– Fails over faster and supports failback and migration
– Supports single and multiple interfaces
– Supports IP multipathing and trunking
– Solaris-only


Testing Local Interface Failover

Test the interface using the procedure shown in the slide. This enables you to determine where the virtual IP address is configured as different interfaces are faulted.

Note: To detect faults with the network interface faster, you may want to decrease the monitor interval for the MultiNICA (or MultiNICB) resource type:
hatype -modify MultiNICA MonitorInterval 15

However, this has a potential impact on network traffic that results from monitoring MultiNICA resources. The monitor function pings one or more hosts on the network for every cycle.

Note: The MonitorInterval attribute indicates how often the Monitor script should run. After the Monitor script starts, other parameters control how many times the target hosts are pinged and how long the detection of a failure takes. To minimize the time that it takes to detect that an interface is disconnected, reduce the HandshakeInterval attribute of the MultiNICA resource type:
hatype -modify MultiNICA HandshakeInterval 60

1. Bring the resources online.
2. Use netstat to determine where the IPMultiNIC/IPMultiNICB IP address is configured.
3. Unplug the network cable from the network interface hosting the IP address.
4. Observe the log and the output of netstat or ifconfig to verify that the administrative and virtual IP addresses have migrated to another network interface.
5. Unplug the cables from all interfaces.
6. Observe the virtual IP address fail over to the other system.
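The commands below sketch how steps 2 and 4 might be performed on Solaris or AIX; exact output formats vary by platform, and the engine log path assumes a default VCS installation.

netstat -in                              # list interfaces and the addresses configured on them
ifconfig -a                              # show which interface currently carries the administrative and virtual IPs
tail -f /var/VRTSvcs/log/engine_A.log    # watch VCS messages while you pull cables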


Summary

This lesson described several sample design requirements related to the storage and network components of an application service, and it provided solutions for the sample designs using VCS resources and attributes. In particular, this lesson described the VCS resources related to third-party volume management software and local NIC failover.

Next Steps
The next lesson describes common maintenance procedures you perform in a cluster environment.

Additional Resources
• VERITAS Cluster Server Bundled Agents Reference Guide
  This document provides important reference information for the VCS agents bundled with VERITAS Cluster Server.
• VERITAS Cluster Server User's Guide
  This guide explains important VCS concepts, including the relationship between service groups, resources, and attributes, and how a cluster operates. This guide also introduces the core VCS processes.
• IP Network Multipathing Administration Guide
  This guide is provided by Sun as a reference for implementing IP multipathing.

Lesson Summary
Key Points
– VCS includes agents to manage storage resources on different UNIX platforms.
– You can configure multiple network interfaces for local failover to increase high availability.
Reference Materials
– VERITAS Cluster Server Bundled Agents Reference Guide
– VERITAS Cluster Server User's Guide
– Sun IP Network Multipathing Administration Guide


Lab 4: Configuring Multiple Network Interfaces

Labs and solutions for this lesson are located on the following pages.
• Appendix A provides brief lab instructions for experienced students: "Lab 4 Synopsis: Configuring Multiple Network Interfaces," page A-20
• Appendix B provides step-by-step lab instructions: "Lab 4 Details: Configuring Multiple Network Interfaces," page B-37
• Appendix C provides complete lab instructions and solutions: "Lab 4 Solution: Configuring Multiple Network Interfaces," page C-63

Goal
The purpose of this lab is to replace the NIC and IP resources with their MultiNIC counterparts.

Results
You can switch between network interfaces on one system without causing a fault and observe failover after forcing both interfaces to fault.

Prerequisites
Obtain any classroom-specific values needed for your classroom lab environment and record these values in your design worksheet that is included with the lab exercise instructions.

[Diagram: Lab 4 design showing service groups nameSG1, nameSG2, and NetworkSG, with resources including nameProcess1, nameMount1, nameVol1, nameDG1, nameIPM1, and nameProxy1; nameProcess2, nameMount2, nameVol2, nameDG2, nameIP2, and nameProxy2; NetworkMNIC, NetworkNIC, and NetworkPhantom; and the AppDG and AppVol storage resources.]


Lesson 5: Maintaining VCS


Introduction

Overview
This lesson describes how to maintain a VCS cluster. Specifically, this lesson shows how to replace hardware, upgrade the operating system, and upgrade software in a VCS cluster.

Importance
A good high availability design should take into account planned downtime as much as unplanned downtime. In today's rapidly changing technical environment, it is important to know how you can minimize downtime due to the maintenance of hardware and software resources after you have your cluster up and running.

Lesson Introduction
– Lesson 1: Reconfiguring Cluster Membership
– Lesson 2: Service Group Interactions
– Lesson 3: Workload Management
– Lesson 4: Storage and Network Alternatives
– Lesson 5: Maintaining VCS
– Lesson 6: Validating VCS Implementation


Outline of Topics
• Making Changes in a Cluster Environment
• Upgrading VERITAS Cluster Server
• Alternative VCS Installation Methods
• Staying Informed

Lesson Topics and Objectives
After completing this lesson, you will be able to:
• Making Changes in a Cluster Environment: Describe guidelines and examples for modifying the cluster environment.
• Upgrading VERITAS Cluster Server: Upgrade VCS to version 4.0 from earlier versions.
• Alternative VCS Installation Methods: Install VCS using alternative methods.
• Staying Informed: Obtain the latest information about your version of VCS.


Making Changes in a Cluster Environment

Replacing a System
Cluster systems may need to be replaced for one of these reasons:
• A system experiences hardware problems and needs to be replaced.
• A system needs to be replaced for performance reasons.

To replace a running system, see the “Workshop: Reconfiguring Cluster Membership” lesson.

Note: Changing the hardware machine type may have an impact on the validity of the existing VCS license. You may need to apply for a new VCS license before replacing the system. Contact VERITAS technical support before making any changes.

When you must replace a cluster system, consider the following:
• Changes in system type may impact VCS licensing. Check with VERITAS support.
• Although not a strict requirement, it is recommended that you use the same operating system version on the new system as on the other systems in the cluster. The new system should have the same version of any VERITAS products that are in use on the other systems in the cluster.
• Changes in device names may have an impact on the existing VCS configuration. For example, device name changes may affect the network interfaces used by VCS resources.


Preparing for Software and Hardware Upgrades

When planning to upgrade any component in the cluster, consider how the upgrade process will impact service availability and how that impact can be minimized.

First, verify that the component, such as an application, is supported by VCS and, if applicable, the Enterprise agent.

It is also important to have a recent backup of both the systems and the user data before you make any major changes on the systems in the cluster.

If possible, always test any upgrade procedure on nonproduction systems before making changes in a running cluster.

• Identify the configuration tasks that you can perform prior to the upgrade to minimize downtime:
  – User accounts
  – Application configuration files
  – Mount points
  – System or network configuration files
• Ensure that you have a recent backup of the systems and the user data.
• If available, implement changes in a test cluster first.


Operating System Upgrade Example

Before making changes or upgrading an operating system, verify the compatibility of the planned changes with the running VCS version. If there are incompatibilities, you may need to upgrade VCS at the same time as upgrading the operating system.

To install an operating system update that does not require a reboot on the systems in a cluster, you can minimize the downtime of VCS-controlled applications using this procedure (example commands follow the list):
1. Freeze the system to be updated persistently. This prevents applications from failing over to this system while maintenance is being performed.
2. Switch any online applications to other systems.
3. Install the update.
4. Unfreeze the system.
5. Switch applications back to the newly updated system. Test to ensure that the applications run properly on the updated system.
6. If the update has caused problems, switch the applications back to a system that has not been updated.
7. If the applications run properly on the updated system, continue updating other systems in the cluster by following steps 1-6 for each system.
8. Migrate applications to the appropriate system.
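The following is a hedged command sketch of steps 1, 2, 4, and 5, assuming systems named S1 and S2 and a service group named appSG; it is not a substitute for the full procedure above.

haconf -makerw
hasys -freeze -persistent S1     # step 1: prevent failover to S1 during maintenance
haconf -dump -makero
hagrp -switch appSG -to S2       # step 2: move online applications to another system
# ... install the operating system update on S1 (step 3) ...
haconf -makerw
hasys -unfreeze -persistent S1   # step 4
haconf -dump -makero
hagrp -switch appSG -to S1       # step 5: switch back and test on the updated system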



Performing a Rolling Upgrade in a Running Cluster

Some applications support rolling upgrades. That is, you can run one version of the application on one system and a different version on another system. This enables you to move the application service to another system and keep it running while you upgrade the first system.

Rolling Upgrade Example: VxVM

VERITAS Volume Manager is an example of a product that enables you to perform rolling upgrades.

The diagram in the slide shows a general procedure for performing rolling upgrades in a cluster that can be applied to upgrading any application that supports rolling upgrades. This procedure applies to upgrades requiring a system reboot.

For the specific upgrade procedure for your release of Volume Manager, refer to the VERITAS Volume Manager Installation Guide.

Notes:
• Because some of these procedures require the complete removal of the VERITAS Volume Manager packages as well as multiple reboots, you need to stop VCS completely on the system while carrying out the upgrade procedure.
• Upgrading VxVM does not automatically upgrade the disk group versions. You can continue to use the disk group created with an older version. However, any new features may not be available for the disk group until you carry out a manual upgrade of the disk group version. Upgrade the disk group version only after you upgrade VxVM on all the systems in the cluster. After you upgrade the disk group version, older versions of VxVM cannot import it.

The flowchart on the slide summarizes the upgrade steps for each system:
1. Open the configuration:
   haconf -makerw
2. Freeze and evacuate the system:
   hasys -freeze -persistent -evacuate S1
3. Save the configuration and stop VCS on the system:
   haconf -dump -makero
   hastop -sys S1
4. Perform the VxVM upgrade according to the Release Notes.
5. Unfreeze the system:
   haconf -makerw
   hasys -unfreeze -persistent S1
6. If there are more systems to upgrade, repeat steps 2 through 5 for each one.
7. Move groups to the appropriate systems:
   hagrp -switch mySG -to S1
8. Close the configuration:
   haconf -dump -makero
9. If desired, upgrade the disk group version on the system where the disk group is imported:
   vxdg upgrade dgname


Upgrading VERITAS Cluster Server

Preparing for a VCS Upgrade
If you already have a VCS cluster running that is using an earlier version of VCS (prior to 4.x), you can upgrade the software while preserving your current cluster configuration. However, VCS does not support rolling upgrades. That is, you cannot run one version of VCS on one system and a different version on another system in the cluster.

While upgrading VCS, your applications can continue to run, but they are not protected from failure. Consider which tasks you can perform in advance of the actual upgrade procedure to minimize the interval while VCS is not running and your applications are not highly available.

With any software upgrade, the first step should be to back up your existing VCS configuration. Then, contact VERITAS to determine whether there are any situations that require special procedures. Although the procedure to upgrade to VCS version 4.x is provided in this lesson, you must check the release notes before attempting to upgrade. The release notes provide the most up-to-date information on how to upgrade from an earlier version of software.

If you have a large cluster with many different service groups, consider automating certain parts of the upgrade procedure, such as freezing and unfreezing service groups.

If possible, test the upgrade procedure on a nonproduction environment first.

• Determine which tasks you can perform in advance to minimize VCS downtime.
• Back up the VCS configuration (hasnap or hagetcf).
• Contact VERITAS Technical Support.
• Acquire the new VCS software.
• Obtain VCS licenses, if necessary.
• Read the release notes.
• Consider automating tasks with scripts.
• Deploy on a test cluster first.


Upgrading to VCS 4.x from VCS 1.3–3.5

When you run installvcs on cluster systems that run VCS version 1.3.0, 2.0, or 3.5, you are guided through an upgrade procedure.
• For VCS 2.0 and 3.5, before starting the actual installation, the utility updates the cluster configuration (including the ClusterService group and the types.cf file) to match version 4.x.
• For VCS 1.3.0, you must configure the ClusterService group manually. Refer to the VERITAS Cluster Server Installation Guide. After stopping VCS on all systems and uninstalling the previous version, installvcs installs and starts VCS version 4.x.

In a secure environment, run the installvcs utility on each system to upgrade a cluster to VCS 4.x. On the first system, the utility updates the configuration and stops the cluster before upgrading the system. On the other systems, the utility uninstalls the previous version and installs VCS 4.x. After the final system is upgraded and started, the upgrade is complete.

You must upgrade VCS versions prior to 1.3.0 manually using the procedures listed in the VERITAS Cluster Server Installation Guide.

Use the installvcs utility to automatically upgrade VCS:
• The installvcs utility updates the version 2.0 and 3.5 cluster configuration to match version 4.x, including the ClusterService group and types.cf.
• You must configure the ClusterService group manually if you are upgrading to version 4.x from version 1.3.0.
• To upgrade VCS in a secure environment, run installvcs on each cluster system.


Upgrading from VCS QuickStart to VCS 4.x

Use the installvcs -qstovcs option to upgrade systems running VCS QuickStart version 2.0, 3.5, or 4.0 to VCS 4.x. During the upgrade procedure, you must add a VCS license key to the systems. After the systems are properly licensed, the utility modifies the configuration, stops VCS QuickStart, removes the packages for VCS QuickStart (which include the Configuration Wizards and the Web GUI), and adds the VCS packages for documentation and the Web GUI. When restarted, the cluster runs VCS enabled with full functionality.


Other Upgrade Considerations

You may need to upgrade other VCS components, as follows:
• Configure fencing, if supported in your environment. Fencing is supported in VCS 4.x with VxVM 4.x and shared storage devices with SCSI-3 persistent reservations.
• Check whether any Enterprise agents have new versions and upgrade them, if necessary. These agents may have bug fixes or new features of benefit to your cluster environment.
• Upgrade the Java Console, if necessary. For example, earlier versions of the Java Console cannot run on VCS 4.x.
• Although you can use uninstallvcs to automate portions of the upgrade process, you may need to also perform some manual configuration to ensure that customizations are carried forward.

• Manually configure fencing when upgrading to VCS 4.x if shared storage supports SCSI-3 persistent reservations.
• Check for new Enterprise agents and upgrade them, if appropriate.
• Upgrade the Java Console, if necessary.
• Reapply any customizations, if necessary, such as triggers or modifications to agents.


Alternative VCS Installation Methods

Options to the installvcs Utility
VCS provides an installation utility (installvcs) to install the software on all the systems in the cluster and perform initial cluster configuration.

You can also install the software using the operating system command to add software packages individually on each system in the cluster. However, if you install the packages individually, you also need to complete the initial VCS configuration manually by creating the required configuration files.

The manual installation method is described later in this lesson.

Options and Features of the installvcs Utility

Using installvcs in a Secure Environment

In some Enterprise environments, ssh or rsh communication is not allowed between systems. If the installvcs utility detects communication problems, it prompts you to confirm that it should continue the installation only on the systems with which it can communicate (most often this is just the local system). A response file (/opt/VRTS/install/logs/installvcsdate_time.response) is created that can then be copied to the other systems. You can then use the -responsefile option to install and configure VCS on the other systems using the values from the response file.

The installvcs utility supports several options for installing VCS:
– Automated installation on all cluster systems, including configuration and startup (default)
– Installation in a secure environment by way of the unattended installation feature: -responsefile
– Installation without configuration: -installonly
– Configuration without installation: -configure

You can also manually install VCS using the operating system command for adding software packages.
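The invocations below illustrate these options. They are examples only, run from the directory containing the installvcs script, and the response file name follows the pattern described later in this topic.

./installvcs                        # default: install, configure, and start VCS on all systems
./installvcs -installonly           # install and license packages without creating configuration files
./installvcs -configure             # configure a cluster on systems where the packages are already installed
./installvcs -responsefile /opt/VRTS/install/logs/installvcsdate_time.response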


You can also use this option to perform unattended installation. You can manually assign values to variables in the installvcsdate_time.response file based on your installation environment. This information is passed to the installvcs script.

Note: Until VCS is installed and started on all systems in the cluster, an error message is displayed when VCS is started.

Using installvcs to Install Without Configuration

You can install the VCS packages on a system before they are ready for cluster configuration using the -installonly option. The installation program licenses and installs VCS on the systems without creating any VCS configuration files.

Using installvcs to Configure Without Installation

If you installed VCS without configuration, use the -configure option to configure VCS. The installvcs utility prompts for cluster information and creates VCS configuration files without performing installation of VCS packages.

Upgrading VCS

When you run installvcs on cluster systems that run VCS 2.0 or VCS 3.5, the utility guides you through an upgrade procedure.


Manual Installation Procedure
Using the manual installation method individually on each system is appropriate when:
• You are installing a single VCS package.
• You are installing VCS to a single system.
• You do not have remote root access to other systems in the cluster.

The VCS installation procedure using the operating system installation utility, such as pkgadd on Solaris, requires administrator access to each system in the cluster. The installation steps are as follows:
1 Install VCS packages using the appropriate operating system installation utility.
2 License the software using vxlicinst.
3 Configure the files /etc/llttab, /etc/llthosts, and /etc/gabtab on each system. (Example files are sketched after the flowchart below.)
4 Configure fencing, if supported in your environment.
5 Configure /etc/VRTSvcs/conf/config/main.cf on one system in the cluster.
6 Manually start LLT, GAB, and HAD to bring the cluster up without any services.
7 Configure high availability services.

Manual Installation Procedure (flowchart):
Start → Install VCS packages using the platform-specific install utility. → Enter license keys using vxlicinst. → Configure the cluster interconnect. → Configure fencing, if used. → Configure main.cf. → Start LLT, GAB, fencing, and then HAD. → Configure other services. → Done
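As a reference for step 3, here is a minimal sketch of the three cluster interconnect files for a hypothetical two-node Solaris cluster (node names train1 and train2, cluster ID 10, and qfe/eri interfaces are example values; adjust them for your environment):

   # /etc/llttab on train1 (train2 uses set-node train2)
   set-node train1
   set-cluster 10
   link qfe0 /dev/qfe:0 - ether - -
   link qfe1 /dev/qfe:1 - ether - -
   link-lowpri eri0 /dev/eri:0 - ether - -

   # /etc/llthosts (identical on all systems)
   0 train1
   1 train2

   # /etc/gabtab (identical on all systems; -n is the number of systems)
   /sbin/gabconfig -c -n 2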


Notes:• Start the cluster on the system with the main.cf file that you have created.

Then start VCS on the remaining systems. Because the systems share an in-memory copy of main.cf, the original copy is shared with the other systems and copied to their local disks.

• Install Cluster Manager (the VCS Java-based graphical user interface package), VRTScscm, after VCS is installed.


Licensing VCS
VCS is a licensed product. Each system requires a license key to run VCS. If VCS is installed manually, or if you are upgrading from a demo to a permanent license:
1 Shut down VCS and keep applications running:
  hastop -all -force
2 Run the vxlicinst utility on each system:
  vxlicinst -k XXXX-XXXX-XXXX-XXXX
3 Restart VCS on each system:
  hastart

Checking License Information

VERITAS provides a utility to display license information, vxlicrep. Executing this command displays the product licensed, the type of license (demo or permanent), and the license key. If the license is a “demo,” an expiration date is also displayed.

To use the vxlicrep utility to display license information:
vxlicrep

Licensing VCS
There are two cases in which a VCS license may need to be added or updated using vxlicinst:
• VCS is installed manually.
• A demo license is upgraded to a demo extension or a permanent license.
To install a license:
1. Stop VCS.
2. Run vxlicinst on each system:
   vxlicinst -k key
3. Restart VCS on each system.
To display licenses of all VERITAS products, use the vxlicrep command.


Creating a Single-Node Cluster
You may want to create a one-node cluster for test purposes, or as a failover cluster in a disaster recovery plan that includes VERITAS Volume Replicator and VERITAS Global Cluster Option (formerly VERITAS Global Cluster Manager). The single-node cluster can be in a remote secondary location, ready to take over applications from the primary site in case of a site outage.

Creating a Single-Node Cluster
You can install VCS on a single system as follows:
• Install the VCS software using the platform-specific installation utility or installvcs.
• Remove any LLT or GAB configuration and startup files, if they exist.
• Create and modify the VCS configuration files as necessary.
• VCS 3.5:
  – Modify the VCS startup file for single-node operation. Change the HASTART line to:
    HASTART="/opt/VRTSvcs/bin/hastart -onenode"
  – Start VCS and verify single-node operation: hastart -onenode
• VCS 4.x:
  – Start VCS normally using hastart. VCS 4.x checks main.cf and automatically runs hastart -onenode if there is only one system listed.
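A minimal sketch of a single-node main.cf, assuming a hypothetical node name train1 (service groups are added after VCS is started):

   include "types.cf"

   cluster vcs_single (
           )

   system train1 (
           )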


Staying Informed
Obtaining Information from VERITAS Support
With each new release of the VERITAS products, changes are made that may affect the installation or operation of VERITAS software in your environment. By reading version release notes and installation documentation that are included with the product, you can stay informed of any changes.

For more information about specific releases of VERITAS products, visit the VERITAS Support Web site at: http://support.veritas.com. You can select the product family and the specific product that you are interested in to find detailed information about each product.

You can also sign up for the VERITAS E-mail Notification Service to receive bulletins about products that you are using.

Obtaining Information from VERITAS Support


Summary
This lesson introduced various procedures to maintain the systems in a VCS cluster while minimizing application downtime. Specifically, replacing system hardware, upgrading operating system software, upgrading VERITAS Storage Foundation, and upgrading and patching VERITAS Cluster Server were discussed in detail.

Next Steps
The next lesson discusses the process of deploying a high availability solution using VCS and introduces some best practices.

Additional Information
• VERITAS Cluster Server Installation Guide
  This guide provides information on how to install and upgrade VERITAS Cluster Server (VCS) on the specified platform.
• VERITAS Cluster Server User's Guide
  This document provides information about all aspects of VCS configuration.
• VERITAS Volume Manager Installation Guide
  This document provides information on how to install and upgrade VERITAS Volume Manager.
• http://support.veritas.com
  Contact VERITAS Support for information about installing and updating VCS and other software and hardware in the cluster.

Lesson Summary
Key Points
– Use these guidelines to determine the appropriate installation and upgrade methods for your cluster environment.
– Access the VERITAS Support Web site for information about VCS.
Reference Materials
– VERITAS Cluster Server Installation Guide
– VERITAS Cluster Server User's Guide
– VERITAS Volume Manager Installation Guide
– http://support.veritas.com

Lesson 6
Validating VCS Implementation

Introduction
Overview
This lesson provides a review of best practices discussed throughout the course. The lesson concludes with a discussion of verifying that the implementation of your high availability environment meets your design criteria.

Importance
By verifying that your site is properly implemented and configured according to best practices, you ensure the success of your high availability solution.

Lesson Introduction
Lesson 1: Reconfiguring Cluster Membership
Lesson 2: Service Group Interactions
Lesson 3: Workload Management
Lesson 4: Storage and Network Alternatives
Lesson 5: Maintaining VCS
Lesson 6: Validating VCS Implementation


Outline of Topics
• VCS Best Practices Review
• Solution Acceptance Testing
• Knowledge Transfer
• High Availability Solutions

Lesson Topics and Objectives
After completing this lesson, you will be able to:
Topic: VCS Best Practices Review – Describe best practice recommendations for VCS.
Topic: Solution Acceptance Testing – Plan for solution acceptance testing.
Topic: Knowledge Transfer – Transfer knowledge to other administrative staff.
Topic: High Availability Solutions – Describe other high availability solutions and information references.


VCS Best Practices Review
This section provides a review of best practices for optimal configuration of a high availability environment using VCS. These best practice recommendations have been described throughout this course; they are summarized here as a review and reference tool. You can use this information to review your cluster configuration, and then perform the final testing, verification, and knowledge transfer activities to conclude the deployment phase of the high availability implementation project.

Cluster Interconnect
The more robust your cluster interconnect, the less risk you have of downtime due to failures or a split brain condition.

If you are using fencing in your cluster, you have no risk of a split brain condition occurring. In this case, failure of the cluster interconnect results only in downtime while systems reboot and applications fail over. Having redundant links for the cluster interconnect to maintain the cluster membership ensures the highest availability of service.

For clusters that do not use fencing, robustness of the cluster interconnect is critical. Configure at least two Ethernet networks with completely separate interconnects to minimize the risk that all links can fail simultaneously. Also, configure a low-priority link on the public or administrative interface. The performance impact is imperceptible when the Ethernet interconnect is functioning, and the added level of protection is highly recommended.

Note: Do not configure multiple low-priority links on the same public network. LLT will report lost and delayed heartbeats in this case.

Cluster Interconnect
• Configure two Ethernet LLT links with separate infrastructures for the cluster interconnect.
• Ensure that there are no single points of failure.
  – Do not place both LLT links on interfaces on the same card.
  – Use redundant hubs or switches.
• Ensure that no routers are in the heartbeat path.
• Configure a low-priority link on the public network for additional redundancy.
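To confirm that the redundant links are actually in use, the LLT and GAB status commands can be checked on each node; a brief sketch (output formats vary by platform and release):

   # Verify that LLT sees all configured links, including the low-priority link, as UP on every node
   lltstat -nvv | more

   # Verify that GAB membership includes all nodes (port a) and HAD (port h)
   gabconfig -a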


Shared Storage
In addition to the recommendations listed in the slide, consider using similar or identical hardware configurations for systems and storage devices in the cluster. Although not a requirement, this simplifies administration and management.

Note: You may require different licenses for VERITAS products depending on the type of systems used in the cluster.

Shared Storage
• Configure redundant interfaces to redundant shared storage arrays.
• Shared disks on a SAN must reside in the same zone as all nodes in the cluster.
• Use a volume manager and file system that enable you to make changes to a running configuration.
• Mirror all data used within the HA environment across storage arrays.
• Ensure that all cluster data is included in the backup scheme and periodically test restoration.


Public Network
Hardware redundancy for the public network maximizes high availability for application services requiring network access. While a configuration with only one public network connection for each cluster system still provides high availability, loss of that connection incurs downtime while the application service fails over to another system.

To further reduce the possibility of downtime, configure multiple interfaces to the public network on each system, each with its own infrastructure, including hubs, switches, and interface cards.

Public Network
• A dedicated administrative IP address must be allocated to each node of the cluster. This address must not be failed over to any other node.
• One or more IP addresses should be allocated for each service group requiring client access.
• DNS entries should map to the application (virtual) IP addresses for the cluster.
• When specifying NetworkHosts for the NIC resource, specify more than one highly available IP address. Do not specify localhost.
• The highly available IP addresses should be noted in the hosts file.
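A sketch of setting NetworkHosts on an existing NIC resource from the command line (the resource name NetworkNIC follows the course lab naming, and the router addresses are example values):

   haconf -makerw
   hares -modify NetworkNIC NetworkHosts 192.168.27.1 192.168.27.2
   haconf -dump -makero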


Failover Configuration
Be sure to review each resource to determine whether it is critical enough to the service to cause failover in the event of a fault. Be aware that all resources are set to Critical by default when initially created.

Also, ensure that you understand how each resource and service group attribute affects failover. You can use the VCS Simulator to model how to apply attribute values to determine failover behavior before you implement them in a running cluster.

Failover Configuration
• Ensure that each resource required to provide a service is marked as Critical to enable automatic failover in the event of a fault.
• If a resource should not cause failover when it faults, be sure to set Critical to 0. When you initially configure resources, they are set to Critical by default.
• Use appropriate resource and service group attributes, such as RestartLimit, ManageFaults, and FaultPropagation, to refine failover behavior.
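A sketch of tuning these attributes from the command line, using the course lab names as examples (RestartLimit is a resource type attribute, so it is overridden here at the resource level):

   haconf -makerw
   # Do not fail the group over if this resource faults
   hares -modify nameProcess1 Critical 0
   # Let the agent restart the resource once in place before declaring a fault
   hares -override nameProcess1 RestartLimit
   hares -modify nameProcess1 RestartLimit 1
   # Leave faulted resources to the administrator instead of taking automatic action
   hagrp -modify nameSG1 ManageFaults NONE
   # Prevent a resource fault from propagating into a service group failover
   hagrp -modify nameSG1 FaultPropagation 0
   haconf -dump -makero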


External Dependencies
Where possible, minimize any dependency of high availability services on resources outside the cluster environment. By doing so, you reduce the possibility that your services are affected by failures external to the cluster.

External Dependencies
• Ensure that there are no dependencies on external resources that can hinder a failover, such as NFS remote mounts or NIS.
• Ensure that other resources, such as DNS and gateways, are themselves highly available.
• Consider using local /etc/hosts files for HA services that rely on network resources within the cluster, rather than using DNS.


Testing
One of the most critical aspects of implementing and maintaining a cluster environment is to thoroughly verify the configuration in a test cluster environment. Furthermore, test each change to the configuration in a methodical fashion to simplify problem discovery, diagnosis, and solution.

Only after you are satisfied with the cluster operating in the test environment should you deploy the configuration to a production environment.

Testing
• Maintain a test cluster and try out any changes before modifying your production cluster.
• Use the Simulator to try configuration changes.
• Before considering the cluster operational, thoroughly test all failure scenarios.
• Create a set of acceptance tests that can be run whenever you change the cluster environment.


Other Considerations
Some additional recommendations for effectively implementing and managing your high availability VCS environment are:
• A key overriding concept for successful implementation and subsequent management of a high availability environment is simplicity of design and configuration. Minimizing complication within the cluster helps simplify day-to-day management and troubleshooting problems that may arise.
• Commands, such as reboot and halt, stop the system without running the init-level scripts. This means that VCS is not shut down gracefully. In this case, when the system restarts, service groups are autodisabled and do not start up automatically. Consider renaming these commands and creating scripts in their place that echo a reminder message that describes the effects on cluster services.

Other Considerations
• Keep your high availability design and implementation simple. Unnecessary complexity can hinder troubleshooting and increase downtime.
• Consider renaming commands, such as reboot and halt, and creating scripts in their place. This can protect you against ingrained practices by administrators that can adversely affect high availability.
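One possible sketch of such a replacement script, assuming the original command has first been renamed (for example, to /usr/sbin/reboot.orig):

   #!/bin/sh
   # Reminder wrapper installed in place of /usr/sbin/reboot on cluster nodes
   echo "This system is a VCS cluster node."
   echo "Stop VCS first (for example, hastop -local -evacuate) so that service"
   echo "groups fail over cleanly; then run /usr/sbin/reboot.orig if a reboot"
   echo "is really required."
   exit 1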


Solution Acceptance Testing
Up to this point, the deployment phase should have been completed according to the plan resulting from the design phase. After completing the deployment phase, perform solution acceptance testing to ensure that the cluster configuration meets the requirements established at project initiation. Involve critical staff who will be involved in maintaining the cluster and the highly available application services in the acceptance testing process, if possible. Doing so helps ensure a smooth transition from deployment to maintenance.

Solution-Level Acceptance Testing
• Part of an implementation plan
• Demonstrates that the HA solution meets users' requirements
• Solution-oriented, but includes individual feature testing
• Recommended that you have predefined tests
• Executed at the final stage of the implementation


Examples of Solution Acceptance Testing
VERITAS recommends that you develop a solution acceptance test plan. The example in the slide shows items to check to confirm that there are no single points of failure in the HA environment.

A test plan of this nature, at minimum, documents the criteria that the system test must meet in order to ensure that the deployment was successful and complete.

Note: The solution acceptance test recommendations described here should be inclusive, and not exclusive, of other appropriate tests that you may decide to run.

Examples of Solution Acceptance Testing
Solution-level testing:
• Demonstrate major HA capabilities, such as:
  - Manual and automatic application failover
  - Loss of public network connections
  - Server failure
  - Cluster interconnect failure
Goal: Verify and demonstrate that the high availability solution is working correctly and satisfies the design requirements.
Success: Complete the tests demonstrating expected results.


Knowledge Transfer
Knowledge transfer can be divided into product functionality and administration considerations.

If the IT staff who will maintain the cluster are participating in the solution acceptance testing, as is strongly recommended, then this time can be used to explain how VERITAS products—individually and integrated—function in the HA environment.

Note: Knowledge transfer is not a substitute for formal instructor-led classes or Web-based training. Knowledge transfer focuses on communicating the specific details of the implementation and its effects on application services.

System and Network Administration
The installation of a high availability solution that includes VERITAS Cluster Server has implications on the administration and maintenance of the servers in the cluster. For example, to maintain high availability, VCS nodes should not have any dependencies on systems outside of the cluster.

Network administrators need to understand the impact of losing network communications in the cluster and also the impact of configuring a low-priority link on the public network.

System and Network Administrators
• Do system administrators understand that clustered systems should not rely on services outside the cluster?
  – The cluster node should not be an NIS client of a server outside of the cluster.
  – The cluster node should not be an NFS client.
• Do network administrators understand the impact of bringing the network down?
  Potential for causing network partitions and split brain
• Do network administrators understand the effect of having a low-priority cluster interconnect link on the public network?


Application Administration
Application and database administration are also affected by the implementation of an HA solution. Upgrade and maintenance procedures for applications vary depending on whether the binaries are placed on local or shared storage. Also, because applications are now under VCS control, startup and shutdown scripts need to be either removed or renamed in the run control directories. If application data is stored on file systems, those file systems need to be removed or commented out of the file system table.

For example, if an Oracle administrator is performing hot backups on an Oracle database under VCS control, the administrator needs to be aware that, by default, even though VCS fails over the instance, Oracle will not be able to open the database and therefore availability will be compromised. Setting the AutoEndBkup attribute of the Oracle resource tells Oracle to take the database table spaces out of backup mode before attempting to start the instance.

Application Administrators
Do DBAs understand the impact of VCS on their environment?
• Application binaries and control files
  Shared versus local storage
  – Vendor-dependent
  – Maintenance ease
• Application shutdown
  Use the service group and system freeze option.
• Oracle-specific
  – Instance failure during hot backup may prevent the instance from coming online on a failover node.
  – VCS can be configured to take table spaces out of backup mode.
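For example, a service group (or an entire system) can be frozen before planned application or database maintenance so that VCS does not react to the administrator's actions; a brief sketch using the course lab names:

   haconf -makerw
   hagrp -freeze nameSG1 -persistent     # VCS takes no failover action for this group
   haconf -dump -makero
   # ... perform the application or database maintenance ...
   haconf -makerw
   hagrp -unfreeze nameSG1 -persistent
   haconf -dump -makero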


The Implementation Report
VERITAS recommends that you keep a daily log to describe the progress of the implementation and document any known problems or issues that arise. You can use the log to compile a summary or detailed implementation report as part of the transition to the staff who will maintain the cluster when deployment is complete.

The Implementation Report
• Daily activity log
  Document the entire deployment process.
• Periodic reporting
  Provide interim reporting if appropriate for the duration of the deployment.
• Project handoff document
  – Include the solution acceptance testing report.
  – Summarize the daily log or periodic reports, if completed.
  – Large reports may warrant an overview section providing the net result with the details inside.


High Availability Solutions VCS can be used in a variety of solutions, ranging from local high availability clusters to multisite wide area disaster recovery configurations.

These solutions are described in more detail throughout this section.

Local Cluster with Shared Storage
This configuration was covered by this course material in detail.
• Single site on one campus
• Single cluster architecture
• SAN or dual-initiated shared storage

Local Clustering with Shared Storage
Environment
– One cluster located at a single site
– Redundant servers, networks, and storage for applications and databases
Advantages
– Minimal downtime for applications and databases
– Redundant components eliminating single points of failure
– Application and database migration
Disadvantages
– Data center or site can be a single point of failure in a disaster


Campus or Metropolitan Shared Storage Cluster
• Two different sites within close proximity to each other
• Single cluster architecture, but stretched across a farther distance, subject to latency constraints
• Instead of a single storage array, data is mirrored between arrays with VERITAS Storage Foundation (formerly named Volume Manager).

Campus/Stretch Cluster
Environment
– A single cluster stretched over multiple locations, connected through a single subnet and fibre channel SAN
– Storage mirrored between cluster nodes at each location
Advantages
– Provides local high availability within each site and protection against site failure
– Servers placed in multiple sites
– Cost-effective solution: no need for replication
– Quick recovery
– Allows for data center expansion
– Leverages the existing infrastructure
Disadvantages
– Cost: requires a SAN infrastructure
– Distance limitations


Replicated Data Cluster (RDC)
• Two different sites within close proximity to each other, stretched across a farther distance
• Replication used for data consistency instead of Storage Foundation mirroring

Replicated Data Cluster
Environment
– One cluster with a minimum of two servers; one server at each location, for replicated storage
– Cluster stretches between multiple buildings, data centers, or sites connected by way of Ethernet (IP)
Advantages
– Can use IP rather than SAN (with VVR)
– Cost: does not require a SAN infrastructure
– Protection against disasters local to a building, data center, or site
– Leverages the existing Ethernet connection
Disadvantages
– A more complex solution
– Synchronous replication required


Wide Area Network (WAN) Cluster for Disaster Recovery
• Multiple sites with no geographic limitations
• Two or more clusters on different subnets
• Replication used for data consistency, with more complex failover control

Wide Area Network Cluster for Disaster Recovery
Environment
Multiple clusters provide local failover and remote site takeover for distance disaster recovery.
Advantages
– Can support any distance using IP
– Multiple replication solutions
– Multiple clusters for local failover before remote takeover
– Single point monitoring of all clusters
Disadvantages
– Cost of a remote hot site


High Availability References
Use these references as resources for building a complete understanding of high availability environments within your organization.
• The Resilient Enterprise: Recovering Information Services from Disasters
  The Resilient Enterprise explains the nature of disasters and their impacts on enterprises, organizing and training recovery teams, acquiring and provisioning recovery sites, and responding to disasters.
• Blueprints for High Availability: Designing Resilient Distributed Systems
  Provides the tools to deploy a system with a step-by-step guide through the building of a network that runs with high availability, resiliency, and predictability.
• High Availability Design, Techniques, and Processes
  A best practice guide on how to create systems that will be easier to maintain, including anticipating and preventing problems, and defining ongoing availability strategies that account for business change.
• Designing Storage Area Networks
  The text offers practical guidelines for using diverse SAN technologies to solve existing networking problems in large-scale corporate networks. With this book you learn how the technologies work and how to organize their components into an effective, scalable design.

High Availability References
• The Resilient Enterprise: Recovering Information Services from Disasters by Evan Marcus and Paul Massiglia
• Blueprints for High Availability: Designing Resilient Distributed Systems by Evan Marcus and Hal Stern
• High Availability Design, Techniques, and Processes by Floyd Piedad and Michael Hawkins
• Designing Storage Area Networks by Tom Clark
• Storage Area Network Essentials: A Complete Guide to Understanding and Implementing SANs (VERITAS Series) by Richard Barker and Paul Massiglia
• VERITAS High Availability Fundamentals Web-based training


• Storage Area Network Essentials: A Complete Guide to Understanding and Implementing SANs (VERITAS Series)
  Identifies the properties, architectural concepts, technologies, benefits, and pitfalls of storage area networks (SANs). The authors explain the fibre channel interconnect technology and which software components are necessary for building a storage network; they also describe strategies for moving an enterprise from server-centric computing with local storage to a storage-centric information processing environment in which the central resource is universally accessible data.
• VERITAS High Availability Fundamentals Web-based training
  This course gives an overview of high availability concepts and ideas. The course goes on to demonstrate the role of VERITAS products in realizing high availability to reduce downtime and enhance the value of business investments in technology.


VERITAS High Availability Curriculum
Now that you have gained expertise using VERITAS Cluster Server in local area shared storage configurations, you can build on this foundation by completing the following instructor-led courses.

High Availability Design Using VERITAS Cluster Server

This future course enables participants to translate high availability requirements into a VCS design that can be deployed using VERITAS Cluster Server.

VERITAS Cluster Server Agent Development

This course enables participants to create and modify VERITAS Cluster Server agents.

Disaster Recovery Using VVR and Global Cluster Option

This course covers cluster configurations across remote sites, including Replicated Data Clusters (RDCs) and the Global Cluster Option for wide-area clusters.

Learning Path (VERITAS Cluster Server Curriculum): VERITAS Cluster Server, Fundamentals; then VERITAS Cluster Server, Implementing Local Clusters; followed by High Availability Design Using VERITAS Cluster Server, VERITAS Cluster Server Agent Development, and Disaster Recovery Using VVR and Global Cluster Option.


Summary
This lesson described how to verify that the deployment of your high availability environment meets your design criteria.

Additional Resources
• VERITAS Cluster Server User's Guide
  This guide provides detailed information on procedures and concepts for configuring and managing VCS clusters.
• http://www.veritas.com/products
  From the Products link on the VERITAS Web site, you can find information about all high availability and disaster recovery solutions offered by VERITAS.

Lesson Summary
Key Points
– Follow best-practice guidelines when implementing VCS.
– You can extend your cluster to provide a range of disaster recovery solutions.
Reference Materials
– VERITAS Cluster Server User's Guide
– http://www.veritas.com/products

Appendix A
Lab Synopses

Lab 1 Synopsis: Reconfiguring Cluster Membership
In this lab, work with your lab partners to remove a system from a running cluster, add it to another running cluster, and then merge the two resulting clusters.

Step-by-step instructions for this lab are located on the following page:• “Lab 1 Details: Reconfiguring Cluster Membership,” page B-3

Solutions for this exercise are located on the following page:• “Lab Solution 1: Reconfiguring Cluster Membership,” page C-3

Lab Assignments
Fill in the table with the applicable values for your lab cluster.

Sample Value Your Value

Node names, cluster name, and cluster ID of the two-node cluster from which a system will be removed

train1 train2 vcs1 1

Node names, cluster name, and cluster ID of the two-node cluster to which a system will be added

train3 train4 vcs2 2

Node names, cluster name, and cluster ID of the final four-node cluster

train1 train2 train3 train4 vcs2 2

Lab 1: Reconfiguring Cluster Membership
(Diagram: Task 1 removes a system from the first two-node cluster; Task 2 adds that system to the second two-node cluster; Task 3 merges the remaining one-node cluster with the three-node cluster to form the final four-node cluster.)
Use the lab appendix best suited to your experience level:
Appendix A: Lab Synopses
Appendix B: Lab Details
Appendix C: Lab Solutions


1 Work with your lab partners to fill in the design worksheet with values appropriate for your cluster.

2 Using this information and the procedure described in the lesson, remove the appropriate cluster system.

Task 1: Removing a System from a Running VCS Cluster

Sample Value Your Value

Cluster name of the two-node cluster from which a system will be removed

vcs1

Name of the system to be removed

train2

Name of the system to remain in the cluster

train1

Cluster interconnect configuration

train1: qfe0 qfe1
train2: qfe0 qfe1
Low-priority link:
train1: eri0
train2: eri0

Names of the service groups configured in the cluster

name1SG1, name1SG2, name2SG1, name2SG2, NetworkSG, ClusterService

Any localized resource attributes in the cluster

(Diagram, Task 1: the two-node cluster vcs1 becomes a one-node cluster after the second system is removed.)
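A partial sketch of the key commands, using the sample values above (train2 is removed from cluster vcs1); the complete procedure, including the LLT and GAB file updates, is described in the lesson and in Appendix C:

   # Switch or offline any service groups running on train2, then stop VCS there
   hastop -sys train2

   # On train1: remove train2 from each service group's SystemList, then delete the system
   haconf -makerw
   hagrp -modify name2SG1 SystemList -delete train2     # repeat for each affected group
   hasys -delete train2
   haconf -dump -makero

   # Finally, update /etc/gabtab and /etc/llthosts on train1 and unconfigure
   # GAB and LLT on train2 so that it no longer joins this cluster.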


1 Work with your lab partners to fill in the design worksheet with values appropriate for your cluster.

2 Using this information and the procedure described in the lesson, add the previously removed system to the second cluster.

Task 2: Adding a System to a Running VCS Cluster

Sample Value Your Value

Cluster name of the two-node cluster to which a system will be added

vcs2

Name of the system to be added

train2

Names of systems already in cluster

train3 train4

Cluster interconnect configuration for the three-node cluster

train2: qfe0 qfe1
train3: qfe0 qfe1
train4: qfe0 qfe1
Low-priority link:
train2: eri0
train3: eri0
train4: eri0

Names of service groups configured in the cluster

name3SG1, name3SG2, name4SG1, name4SG2, NetworkSG, ClusterService

Any localized resource attributes in the cluster

(Diagram, Task 2: the previously removed system is added to the second two-node cluster vcs2, making it a three-node cluster.)


1 Work with your lab partners to fill in the design worksheet with values appropriate for your cluster.

2 Using the following information and the procedure described in the lesson, merge the one-node cluster and the three-node cluster.

Task 3: Merging Two Running VCS Clusters

(Diagram, Task 3: the one-node cluster is merged with the three-node cluster to form the final four-node cluster.)


Sample Value Your Value

Node name, cluster name, and ID of the small cluster (the one-node cluster that will be merged to the three-node cluster)

train1 vcs1 1

Node name, cluster name, and ID of the large cluster (the three-node cluster that remains running all through the merging process)

train2 train3 train4 vcs2 2

Names of service groups configured in the small cluster

name1SG1, name1SG2, name2SG1, name2SG2, NetworkSG, ClusterService

Names of service groups configured in the large cluster

name3SG1, name3SG2, name4SG1, name4SG2, NetworkSG, ClusterService

Names of service groups configured in the merged four-node cluster

name1SG1, name1SG2, name2SG1, name2SG2, name3SG1, name3SG2, name4SG1, name4SG2, NetworkSG, ClusterService

Cluster interconnect configuration for the four-node cluster

train1: qfe0 qfe1
train2: qfe0 qfe1
train3: qfe0 qfe1
train4: qfe0 qfe1
Low-priority link:
train1: eri0
train2: eri0
train3: eri0
train4: eri0

Any localized resource attributes in the small cluster

Any localized resource attributes in the large cluster


Lab 2 Synopsis: Service Group Dependencies
Students work separately to configure and test service group dependencies.

Step-by-step instructions for this lab are located on the following page:• “Lab 2 Details: Service Group Dependencies,” page B-17

Solutions for this exercise are located on the following page:• “Lab 2 Solution: Service Group Dependencies,” page C-25

If you already have a nameSG2 service group, skip this section.

1 Verify that nameSG1 is online on your local system.

Preparing Service Groups

Lab 2: Service Group Dependencies
(Diagram: nameSG2 is the parent group and nameSG1 is the child group; the lab tests online local, online global, and offline local dependencies between them.)
Use the lab appendix best suited to your experience level:
Appendix A: Lab Synopses
Appendix B: Lab Details
Appendix C: Lab Solutions


2 Create a service group using the values for your cluster.

3 Copy the loopy script to the / directory on both systems that were in the original two-node cluster.

4 Create a nameProcess2 resource using the appropriate values in your worksheet and bring the resource online.

5 Save and close the cluster configuration.

Service Group Definition Sample Value Your Value

Group nameSG2

Required Attributes

FailOverPolicy Priority

SystemList train1=0 train2=1

Optional Attributes

AutoStartList train1

Resource Definition Sample Value Your Value

Service Group nameSG2

Resource Name nameProcess2

Resource Type Process

Required Attributes

PathName /bin/sh

Arguments /loopy name 2

Critical? No (0)

Enabled? Yes (1)

1 Take the nameSG1 and nameSG2 service groups offline and delete the two nameSGx service groups added in Lab 1 from SystemList for both groups.

Note: Skip this step if you did not complete the “Combining Clusters” lab.

2 Create an online local firm dependency between nameSG1 and nameSG2 with nameSG1 as the child group.

3 Bring both service groups online on your system. Describe what happens in each of these cases.

a Attempt to switch both service groups to any other system in the cluster.

b Stop the loopy process for nameSG1 on your_sys. Watch the service groups in the GUI closely and record how nameSG2 reacts.

c Stop the loopy process for nameSG1 on their_sys. Watch the service groups in the GUI closely and record how nameSG2 reacts.

4 Clear any faulted resources and verify that both service groups are offline.

5 Remove the dependency between the service groups.
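The dependency configuration in this lab can also be done from the command line instead of the GUI; a sketch for this first test (the same pattern, with a different category, location, and type, applies to the later sections):

   haconf -makerw
   hagrp -link nameSG2 nameSG1 online local firm    # parent, child, category, location, type
   hagrp -dep nameSG2                               # display the dependency
   hagrp -unlink nameSG2 nameSG1                    # remove it when the test is complete
   haconf -dump -makero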

1 Create an online local soft dependency between the nameSG1 and nameSG2 service groups with nameSG1 as the child group.

2 Bring both service groups online on your system. Describe what happens in each of these cases.

a Attempt to switch both service groups to any other system in the cluster.

Testing Online Local Firm

Testing Online Local Soft


b Stop the loopy process for nameSG1 on your_sys. Watch the service groups in the GUI closely and record how nameSG2 reacts.

c Stop the loopy process for nameSG1 on their_sys. Watch the service groups in the GUI closely and record how nameSG2 reacts.

3 Describe the differences you observed between the online local firm and online local soft service group dependencies.

4 Clear any faulted resources.

5 Verify that the nameSG1 and nameSG2 service groups are offline.

6 Bring the nameSG1 and nameSG2 service groups online on your system.

7 Kill the loopy process for nameSG2. Watch the service groups in the GUI closely and record how nameSG1 reacts.

8 Clear any faulted resources and verify that both service groups are offline.

9 Remove the dependency between the service groups.

Note: Skip this section if you are using a version of VCS earlier than 4.0. Hard dependencies are only supported in VCS 4.0 and later versions.

1 Create an online local hard dependency between the nameSG1 and nameSG2 service groups with nameSG1 as the child group.

2 Bring both service groups online on your system. Describe what happens in each of these cases.

a Attempt to switch both service groups to any other system in the cluster.

b Stop the loopy process for nameSG1 on your_sys. Watch the service groups in the GUI closely and record how nameSG2 reacts.

Testing Online Local Hard


c Stop the loopy process for nameSG1 on their_sys. Watch the service groups in the GUI closely and record how nameSG2 reacts.

3 Describe the differences you observed between the online local firm/soft and online local hard service group dependencies.

4 Clear any faulted resources and verify that both service groups are offline.

5 Remove the dependency between the service groups.

1 Create an online global firm dependency between nameSG2 and nameSG1 with nameSG1 as the child group.

2 Bring both service groups online on your system. Describe what happens in each of these cases.

a Attempt to switch both service groups to any other system in the cluster.

b Stop the loopy process for nameSG1 on your_sys. Watch the service groups in the GUI closely and record how nameSG2 reacts.

c Stop the loopy process for nameSG1 on their_sys. Watch the service groups in the GUI closely and record how nameSG2 reacts.

3 Clear any faulted resources and verify that both service groups are offline.

4 Remove the dependency between the service groups.

1 Create an online global soft dependency between the nameSG2 and nameSG1 service groups with nameSG1 as the child group.

Testing Online Global Firm Dependencies

Testing Online Global Soft Dependencies


2 Bring both service groups online on your system. Describe what happens in each of these cases.

a Attempt to switch both service groups to any other system in the cluster.

b Stop the loopy process for nameSG1 on your_sys. Watch the service groups in the GUI closely and record how nameSG2 reacts.

c Stop the loopy process for nameSG1 on their_sys. Watch the service groups in the GUI closely and record how nameSG2 reacts.

3 Describe the differences you observed between the online global firm and online global soft service group dependencies.

4 Clear any faulted resources and verify that both service groups are offline.

5 Remove the dependency between the service groups.

1 Create a service group dependency between nameSG1 and nameSG2 such that, if the nameSG1 fails over to the same system running nameSG2, nameSG2 is shut down. There is no dependency that requires nameSG2 to be running for nameSG1 or nameSG1 to be running for nameSG2.

2 Bring the service groups online on different systems.

3 Stop the loopy process for nameSG2 by sending a kill signal. Record what happens to the service groups.

4 Clear the faulted resource and restart the service groups on different systems.

5 Stop the loopy process for nameSG1 on their_sys. Record what happens to the service groups.

6 Clear any faulted resources and verify that both service groups are offline.

Testing Offline Local Dependency


7 Remove the dependency between the service groups.

8 When all lab participants have completed the lab exercise, save and close the cluster configuration.

Implement the behavior of an offline local dependency using the FileOnOff and ElifNone resource types to detect when the service groups are running on the same system.

Hint: Set MonitorInterval and the OfflineMonitorInterval for the ElifNone resource type to 5 seconds.

Remove these resources after the test.

Optional Lab: Using FileOnOff and ElifNone


Lab 3 Synopsis: Testing Workload Management
In this lab, use the VCS Simulator to test failover policies and other workload management behavior.

Step-by-step instructions for this lab are located on the following page:• “Lab 3 Details: Testing Workload Management,” page B-29

Solutions for this exercise are located on the following page:• “Lab 3 Solution: Testing Workload Management,” page C-45

Lab 3: Testing Workload Management
(Worksheet: Simulator config file location: _________________________________________  Copy to: ___________________________________________)
Use the lab appendix best suited to your experience level:
Appendix A: Lab Synopses
Appendix B: Lab Details
Appendix C: Lab Solutions

Preparing the Simulator Environment

1 Add /opt/VRTScssim/bin to your PATH environment variable after any /opt/VRTSvcs/bin entries, if it is not already present.

2 Set VCS_SIMULATOR_HOME to /opt/VRTScssim, if it is not already set.

3 Use the Simulator GUI to add a cluster using these values:
  – Cluster Name: wlm
  – System Name: S1
  – Port: 15560
  – Platform: Solaris
  – WAC Port: -1


4 Copy the main.cf.SGWM.lab file provided by your instructor to a file named main.cf in the simulation configuration directory.

Source location of the main.cf.SGWM.lab file:

___________________________________________cf_files_dir

5 From the Simulator GUI, start the wlm cluster and launch the VCS Java Console for the wlm simulated cluster.

6 Log in as admin with password password.

Notice the cluster name is now VCS. This is the cluster name specified in the new main.cf file you copied into the config directory.

7 Verify that the configuration matches the description shown in the table.

8 In the terminal window you opened previously, set the VCS_SIM_PORT environment variable to 15560.

Note: Use this terminal window for all subsequent commands.

Service Group SystemList AutoStartList

A1 S1 1 S2 2 S3 3 S4 4 S1

A2 S1 1 S2 2 S3 3 S4 4 S1

B1 S1 4 S2 1 S3 2 S4 3 S2

B2 S1 4 S2 1 S3 2 S4 3 S2

C1 S1 3 S2 4 S3 1 S4 2 S3

C2 S1 3 S2 4 S3 1 S4 2 S3

D1 S1 2 S2 3 S3 4 S4 1 S4

D2 S1 2 S2 3 S3 4 S4 1 S4
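The environment settings in steps 1, 2, and 8 can be made in the terminal window as follows (a sketch for a Bourne-type shell):

   PATH=$PATH:/opt/VRTScssim/bin; export PATH
   VCS_SIMULATOR_HOME=/opt/VRTScssim; export VCS_SIMULATOR_HOME
   VCS_SIM_PORT=15560; export VCS_SIM_PORT    # directs the ha commands to the wlm simulated cluster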


1 Verify that the failover policy of all service groups is Priority.

2 Verify that all service groups are online on these systems:

3 If the A1 service group faults, where should it fail over? Verify the failover by faulting a critical resource in the A1 service group.

4 If A1 faults again, without clearing the previous fault, where should it fail over? Verify the failover by faulting a critical resource in A1.

5 Clear the existing faults in A1. Then, fault a critical resource in A1. Where should the service group fail to now?

6 Clear the existing fault in the A1 service group.

Testing Priority Failover Policy

System S1 S2 S3 S4

Groups A1 B1 C1 D1

A2 B2 C2 D2

Load Failover Policy

1 Set the failover policy to Load for the eight service groups.

2 Set the Load attribute for each service group based on the following chart:

   Group   Load
   A1      75
   A2      75
   B1      75
   B2      75
   C1      50
   C2      50
   D1      50
   D2      50

3 Set S1 and S2 Capacity to 200. Set S3 and S4 Capacity to 100. (This is the default value.)

4 The current status of online service groups should look like this:

   System              S1       S2       S3       S4
   Groups              A1 A2    B1 B2    C1 C2    D1 D2
   AvailableCapacity   50       50       0        0

5 If A1 faults, where should it fail over? Fault a critical resource in A1 to observe.
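Steps 1 through 3 can also be performed from the command line against the simulated cluster; a sketch for one group and one system (repeat for the others):

   haconf -makerw
   hagrp -modify A1 FailOverPolicy Load
   hagrp -modify A1 Load 75
   hasys -modify S1 Capacity 200
   haconf -dump -makero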


6 The current status of online service groups should look like this:

   System              S1       S2          S3       S4
   Groups              A2       B1 B2 A1    C1 C2    D1 D2
   AvailableCapacity   125      -25         0        0

7 If the S2 system fails, where should those service groups fail over? Select the S2 system in Cluster Manager and power it off.

8 The current status of online service groups should look like this:

   System              S1          S2     S3          S4
   Groups              B1 B2 A2    -      C1 C2 A1    D1 D2
   AvailableCapacity   -25         200    -75         0

9 Power up the S2 system in the Simulator, clear all faults, and return the service groups to their startup locations.

10 The current status of online service groups should look like this:

   System              S1       S2       S3       S4
   Groups              A1 A2    B1 B2    C1 C2    D1 D2
   AvailableCapacity   50       50       0        0


Prerequisites and Limits

Leave the load settings as they are, but use Prerequisites and Limits so that no more than three of the A1, A2, B1, or B2 service groups can run on a system at any one time.

1 Set the Limits for each system to ABGroup 3.

2 Set the Prerequisites for the A1, A2, B1, and B2 service groups to ABGroup 1.

3 Power off S1 in the Simulator. Where do the A1 and A2 service groups fail over?

4 Power off S2 in the Simulator. Where do the A1, A2, B1, and B2 service groups fail over?

5 Power off S3 in the Simulator. Where do the A1, A2, B1, B2, C1, and C2 service groups fail over?

6 Close the configuration, log off from the GUI, and stop the wlm cluster.
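Steps 1 and 2 can also be performed from the command line; a sketch for one system and one group (repeat for the others; Limits and Prerequisites are name-value attributes, so the exact -modify form may differ slightly between VCS releases):

   haconf -makerw
   hasys -modify S1 Limits ABGroup 3
   hagrp -modify A1 Prerequisites ABGroup 1
   haconf -dump -makero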


Lab 4 Synopsis: Configuring Multiple Network Interfaces
In this lab, modify the NetworkSG and nameSG1 service groups to use multiple public network interfaces in place of the single NIC and IP resources.

Step-by-step instructions for this lab are located on the following page:• "Lab 4 Details: Configuring Multiple Network Interfaces," page B-37

Solutions for this exercise are located on the following page:• "Lab 4 Solution: Configuring Multiple Network Interfaces," page C-63

Solaris
Students work together initially to modify the NetworkSG service group to replace the NIC resource with a MultiNICB resource. Then, students work separately to modify their own nameSG1 service group to replace the IP type resource with an IPMultiNICB resource.

Mobile
The mobile equipment in your classroom may not support this lab exercise.

AIX, HP-UX, Linux
Skip to the MultiNICA and IPMultiNICA section. Here, students work together initially to modify the NetworkSG service group to replace the NIC resource with a MultiNICA resource. Then, students work separately to modify their own service group to replace the IP type resource with an IPMultiNIC resource.

Virtual Academy
Skip this lab if you are working in the Virtual Academy.

Lab 4: Configuring Multiple Network Interfaces
(Diagram: the NetworkSG service group contains the NetworkMNIC, NetworkNIC, and NetworkPhantom resources. The nameProxy1 and nameProxy2 resources in the nameSG1 and nameSG2 service groups point to it, with nameIPM1 and nameIP2 providing the virtual IP addresses above the nameProcess, nameMount, nameVol, and nameDG resources and the shared AppVol/AppDG storage.)


Network Cabling—All Platforms

Note: The MultiNICB lab requires another IP on the 10.x.x.x network to be present outside of the cluster. Normally, other students’ clusters will suffice for this requirement. However, if there are no other clusters with the 10.x.x.x network defined yet, the trainer system can be used.

Your instructor can bring up a virtual IP of 10.10.10.1 on the public network interface on the trainer system, or another classroom system.

1 Verify the cabling or recable the network according to the previous diagram.

2 Set up base IP addresses for the interfaces used by the MultiNICB resource.

Preparing Networking

(Diagram: classroom network cabling for four-node clusters, showing the public/classroom network, the private cluster interconnect networks, and the interfaces reserved for the MultiNIC/VVR/GCO exercises on each of the four systems.)


a Set up the /etc/hosts file on each system to have an entry for each interface on each system using the following address scheme where W, X, Y, and Z are system numbers.

b Set up /etc/hostname.interface files on all systems to enable these IP addresses to be started at boot time. (A sketch of the file contents follows the address table below.)

c Check the local-mac-address? eeprom setting; ensure that it is set to true on each system. If not, change this setting to true.

d Reboot all systems for the addresses and the eeprom setting to take effect. Do this in such a way as to keep the services highly available.

/etc/hosts

10.10.W.2 trainW_qfe2

10.10.W.3 trainW_qfe3

10.10.X.2 trainX_qfe2

10.10.X.3 trainX_qfe3

10.10.Y.2 trainY_qfe2

10.10.Y.3 trainY_qfe3

10.10.Z.2 trainZ_qfe2

10.10.Z.3 trainZ_qfe3
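For step b, the /etc/hostname.interface file on Solaris simply contains the host name assigned to that interface in /etc/hosts, and the eeprom setting in step c can be checked and changed from the shell; a sketch for system W using the example names above:

   # /etc/hostname.qfe2 on trainW
   trainW_qfe2

   # /etc/hostname.qfe3 on trainW
   trainW_qfe3

   # Step c: display and, if necessary, set the OpenBoot variable
   eeprom "local-mac-address?"
   eeprom "local-mac-address?=true"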


Working with your lab partner, use the values in the table to configure a MultiNICB resource to the NetworkSG service group.

Optional mpathd Configuration

You may configure MultiNICB to use mpathd mode as shown in the following steps.

1 Obtain the IP addresses for the /etc/defaultrouter file from you instructor.

__________________________ __________________________

2 Modify the /etc/defaultrouter file on each system, substituting the IP addresses provided in LINE1 and LINE2.

LINE1: route add host 192.168.xx.x -reject 127.0.0.1
LINE2: route add default 192.168.xx.1

3 Set TRACK_INTERFACES_ONLY_WITH_GROUP to yes in /etc/default/mpathd.

4 Set the UseMpathd attribute for NetworkMNICB to 1 and set the MpathdCommand attribute to /sbin/in.mpathd -a.

Configuring MultiNICB

Resource Definition Sample Value Your Value

Service Group NetworkSG

Resource Name NetworkMNICB

Resource Type MultiNICB

Required Attributes

Device qfe2 qfe3

Critical? No (0)

Enabled? Yes (1)


In this portion of the lab, work separately to modify the Proxy resource in your nameSG1 service group to reference the MultiNICB resource.

Reconfiguring Proxy

Resource Definition Sample Value Your Value

Service Group nameSG1

Resource Name nameProxy1

Resource Type Proxy

Required Attributes

TargetResName NetworkMNICB

Critical? No (0)

Enabled? Yes (1)


Create an IPMultiNICB resource in the nameSG1 service group.

Configuring IPMultiNICB

Resource Definition Sample Value Your Value

Service Group nameSG1

Resource Name nameIPMNICB1

Resource Type IPMultiNICB

Required Attributes

BaseResName NetworkMNICB

Netmask 255.255.255.0

Address See the table that follows.

Critical? No (0)

Enabled? Yes (1)

System Address

train1 192.168.xxx.51

train2 192.168.xxx.52

train3 192.168.xxx.53

train4 192.168.xxx.54

train5 192.168.xxx.55

train6 192.168.xxx.56

train7 192.168.xxx.57

train8 192.168.xxx.58

train9 192.168.xxx.59

train10 192.168.xxx.60

train11 192.168.xxx.61

train12 192.168.xxx.62


1 Link the nameIPMNICB1 resource to the nameProxy1 resource.

2 Switch the nameSG1 service group between the systems to test its resources on each system. Verify that the IP address specified in the nameIPMNICB1 resource switches with the service group.

3 Set the new resource to critical (nameIPMNICB1).

4 Save the cluster configuration.

Note: Wait for all participants to complete the steps to this point. Then, test the NetworkMNICB resource by performing the following procedure.

Each student can take turns to test their resource, or all can observe one test.

1 Determine which interface the nameIPMNICB1 resource is using on the system where it is currently online.

2 Unplug the network cable from that interface.

What happens to the nameIPMNICB1 IP address?

3 Determine the status of the interface with the unplugged cable.

4 Leave the network cable unplugged. Unplug the other interface that the NetworkMNICB resource is now using.

What happens to the NetworkMNICB resource and the nameSG1 service group?

Linking and Testing IPMultiNICB

Testing IPMultiNICB Failover


5 Replace the cables.

What happens?

6 Clear the nameIPMNICB1 resource if it is faulted.

7 Save and close the configuration.
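In step 3 above, the interface status can be checked from the operating system and from VCS. A minimal sketch, assuming the Solaris worksheet values (qfe2/qfe3 and the resource names NetworkMNICB and nameIPMNICB1); adjust the names to your configuration:

ifconfig -a                 # compare the flags and addresses reported for qfe2 and qfe3
hares -state NetworkMNICB   # how VCS sees the network resource
hares -state nameIPMNICB1   # how VCS sees the IP resource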

Note: Only complete this lab if you are working on an AIX, HP-UX, or Linux system in your classroom.

Work together using the values in the table to create a MultiNICA resource.

Alternate Lab: Configuring MultiNICA and IPMultiNIC

Resource Definition Sample Value Your Value

Service Group NetworkSG

Resource Name NetworkMNICA

Resource Type MultiNICA

Required Attributes

Device (See the table that follows for admin IPs.)

AIX: en3, en4
HP-UX: lan3, lan4
Linux: eth3, eth4

NetworkHosts (HP-UX only)

192.168.xx.xxx (See the instructor.)

NetMask (AIX, Linux only)

255.255.255.0

Critical? No (0)

Enabled? Yes (1)


1 Verify the cabling or recable the network according to the previous diagram.

2 Set up the /etc/hosts file on each system to have an entry for each interface on each system in the cluster using the following address scheme where 1, 2, 3, and 4 are system numbers.

/etc/hosts
10.10.10.101 train1_mnica
10.10.10.102 train2_mnica
10.10.10.103 train3_mnica
10.10.10.104 train4_mnica

System Admin IP Address

train1 10.10.10.101

train2 10.10.10.102

train3 10.10.10.103

train4 10.10.10.104

train5 10.10.10.105

train6 10.10.10.106

train7 10.10.10.107

train8 10.10.10.108

train9 10.10.10.109

train10 10.10.10.110

train11 10.10.10.111

train12 10.10.10.112


3 Working together, add the NetworkMNICA resource to the NetworkSG service group.

4 Save the cluster configuration.



In this portion of the lab, modify the Proxy resource in the nameSG1 service group to reference the MultiNICA resource and remove the IP resource.

Reconfiguring Proxy

Resource Definition Sample Value Your Value

Service Group nameSG1

Resource Name nameProxy1

Resource Type Proxy

Required Attributes

TargetResName NetworkMNICA

Critical? No (0)

Enabled? Yes (1)


Each student works separately to create an IPMultiNIC resource in their own nameSG1 service group using the values in the table.

Configuring IPMultiNIC

Resource Definition Sample Value Your Value

Service Group nameSG1

Resource Name nameIPMNIC1

Resource Type IPMultiNIC

Required Attributes

MultiNICResName NetworkMNICA

Address See the table that follows.

NetMask (HP-UX, Linux only)

255.255.255.0

Critical? No (0)

Enabled? Yes (1)

System Address

train1 192.168.xxx.51

train2 192.168.xxx.52

train3 192.168.xxx.53

train4 192.168.xxx.54

train5 192.168.xxx.55

train6 192.168.xxx.56

train7 192.168.xxx.57

train8 192.168.xxx.58

train9 192.168.xxx.59

train10 192.168.xxx.60

train11 192.168.xxx.61

train12 192.168.xxx.62


1 Link the nameIPMNIC1 resource to the nameProxy1 resource.

2 If present, link the nameProcess1 or nameApp1 resource to nameIPMNIC1.

3 Switch the nameSG1 service group between the systems to test its resources on each system. Verify that the IP address specified in the nameIPMNIC1 resource switches with the service group.

4 Set the new resource to critical (nameIPMNIC1).

5 Save the cluster configuration.

Linking IPMultiNIC


Note: Wait for all participants to complete the steps to this point. Then, test the NetworkMNICA resource by performing the following procedure.

Each student can take turns to test their resource, or all can observe one test.

1 Determine which interface the nameIPMNIC1 resource is using on the system where it is currently online.

2 Unplug the network cable from that interface.

What happens to the nameIPMNIC1 IP address?

3 Determine the status of the interface with the unplugged cable.

4 Leave the network cable unplugged. Unplug the other interface that the NetworkMNICA resource is now using.

What happens to the NetworkMNICA resource and the nameSG1 service group?

5 Replace the cables.

What happens?

6 Clear the nameIPMNIC1 resource if it is faulted.

7 Save and close the configuration.

Testing IPMultiNIC Failover


Appendix B: Lab Details


Lab 1 Details: Reconfiguring Cluster Membership


Students work together to create four-node clusters by combining two-node clusters.

Brief instructions for this lab are located on the following page:
• “Lab 1 Synopsis: Reconfiguring Cluster Membership,” page A-2

Solutions for this exercise are located on the following page:
• “Lab Solution 1: Reconfiguring Cluster Membership,” page C-3

Lab 1: Reconfiguring Cluster Membership

(Slide graphic: the three lab tasks. Task 1: remove a system from one two-node cluster. Task 2: add that system to the other two-node cluster, creating a three-node cluster. Task 3: merge the remaining one-node cluster into the three-node cluster to form a four-node cluster. Use the lab appendix best suited to your experience level: Appendix A: Lab Synopses, Appendix B: Lab Details, or Appendix C: Lab Solutions.)


Lab Assignments

Fill in the table with the applicable values for your lab cluster.

Sample Value Your Value

Node names, cluster name, and cluster ID of the two-node cluster from which a system will be removed

train1 train2 vcs1 1

Node names, cluster name, and cluster ID of the two-node cluster to which a system will be added

train3 train4 vcs2 2

Node names, cluster name, and cluster ID of the final four-node cluster

train1 train2 train3 train4 vcs2 2


Fill in the design worksheet with values appropriate for your cluster and use the information to remove a system from a running VCS cluster.

Task 1: Removing a System from a Running VCS Cluster

Sample Value Your Value

Cluster name of the two-node cluster from which a system will be removed

vcs1

Name of system to be removed

train2

Name of system to remain in the cluster

train1

Cluster interconnect configuration

train1: qfe0 qfe1
train2: qfe0 qfe1
Low-priority link:
train1: eri0
train2: eri0

Names of service groups configured in the cluster

name1SG1, name1SG2, name2SG1, name2SG2, NetworkSG, ClusterService

Any localized resource attributes in the cluster



1 Prevent application failover to the system to be removed.

2 Switch any application services that are running on the system to be removed to any other system in the cluster.

Note: This step can be combined with either step 1 or step 3 as an option to a single command line.

3 Stop VCS on the system to be removed.

4 Remove any disk heartbeat configuration on the system to be removed.

Note: No disk heartbeats are configured in the classroom. This step is included as a reminder in the event you use this lab in a real-world environment.

5 Stop VCS communication modules (GAB and LLT) and I/O fencing on the system to be removed.

Note: On the Solaris platform, you also need to unload the kernel modules.

6 Physically remove cluster interconnect links from the system to be removed.

7 Remove VCS software from the system taken out of the cluster.

Note: For purposes of this lab, you do not need to remove the software because this system is put back in the cluster later. This step is included in case you use this lab as a guide to removing a system from a cluster in a real-world environment.

8 Update service group and resource configurations that refer to the system that is removed.

Note: Service group attributes, such as AutoStartList, SystemList, SystemZones, and localized resource attributes, such as Device for NIC or IP resource types, may need to be modified.

9 Remove the system from the cluster configuration.

10 Save the cluster configuration.

11 Modify the VCS communication configuration files on the remaining systems in the cluster to reflect the change.


– Edit /etc/llthosts on all the systems remaining in the cluster (train1 in this example) to remove the line corresponding to the removed system (train2 in this example).

– Edit /etc/gabtab on all the systems remaining in the cluster (train1 in this example) to reduce the –n option to gabconfig by 1.

Note: You do not need to stop and restart LLT and GAB on the remaining systems when you make changes to the configuration files unless the /etc/llttab file contains the following directives that need to be changed:
– include system_ID_range
– exclude system_ID_range
– set-addr systemID tag address
For more information on these directives, see the VCS manual pages on llttab.
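As a reference for the preceding steps, the following is a minimal command sketch, assuming the sample values (removing train2, keeping train1) and a single service group named name1SG1; repeat the group commands for each service group, and see Appendix C for the complete solution. The system can also be frozen first (hasys -freeze train2) to prevent failover to it.

# Steps 1-3: move services off train2 and stop VCS there
hagrp -switch name1SG1 -to train1       # repeat for each group online on train2
hastop -sys train2
# Steps 8-10: remove train2 from the configuration
haconf -makerw
hagrp -modify name1SG1 SystemList -delete train2
hagrp -modify name1SG1 AutoStartList -delete train2
hasys -delete train2
haconf -dump -makero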


Fill in the design worksheet with values appropriate for your cluster and use the information to add a system to a running VCS cluster.

Task 2: Adding a System to a Running VCS Cluster

Sample Value Your Value

Cluster name of the two-node cluster to which a system will be added

vcs2

Name of system to be added

train2

Names of systems already in cluster

train3 train4

Cluster interconnect configuration for the three-node cluster

train2: qfe0 qfe1
train3: qfe0 qfe1
train4: qfe0 qfe1
Low-priority link:
train2: eri0
train3: eri0
train4: eri0

Names of service groups configured in the cluster

name3SG1, name3SG2, name4SG1, name4SG2, NetworkSG, ClusterService

Any localized resource attributes in the cluster



1 Install any necessary application software on the new system.

Note: In the classroom, you do not need to install any other set of application binaries on your system for this lab.

2 Configure any application resources necessary to support clustered applications on the new system.

Note: The new system should be capable of running the application services in the cluster it is about to join. Preparing application resources may include:
– Creating user accounts
– Copying application configuration files
– Creating mount points
– Verifying shared storage access
– Checking NFS major and minor numbers

Note: For this lab, you only need to create the necessary mount points on all the systems for the shared file systems used in the running VCS clusters (vcs2 in this example).

3 Physically cable cluster interconnect links.

Note: If the original cluster is a two-node cluster with crossover cables for cluster interconnect, you need to change to hubs or switches before you can add another node. Ensure that the cluster interconnect is not completely disconnected while you are carrying out the changes.

4 Install VCS on the new system. If you skipped the removal step in the previous section as recommended, you do not need to install VCS on this system.

Notes:
– You can either use the installvcs script with the -installonly option to automate the installation of the VCS software or use the command specific to the operating platform, such as pkgadd for Solaris, swinstall for HP-UX, installp -a for AIX, or rpm for Linux, to install the VCS software packages individually.
– If you are installing packages manually:
› Follow the package dependencies. For the correct order, refer to the VERITAS Cluster Server Installation Guide.
› After the packages are installed, license VCS on the new system using the /opt/VRTS/bin/vxlicinst -k command.


a Record the location of the installation software provided by your instructor.

Installation software location:

____________________________________________________

b Start the installation.

c Specify the name of the new system to the script (train2 in this example).

5 Configure VCS communication modules (GAB, LLT) on the added system.

Note: You must complete this step even if you did not remove and reinstall the VCS software.

6 Configure fencing on the new system, if used in the cluster.

7 Update VCS communication configuration (GAB, LLT) on the existing systems.

Note: You do not need to stop and restart LLT and GAB on the existing systems in the cluster when you make changes to the configuration files unless the /etc/llttab file contains the following directives that need to be changed:
– include system_ID_range
– exclude system_ID_range
– set-addr systemID tag address
For more information on these directives, check the VCS manual pages on llttab.

8 Install any VCS Enterprise agents required on the new system.

Notes:
– No agents are required to be installed for this lab exercise.
– Enterprise agents should only be installed, not configured.


9 Copy any triggers, custom agents, scripts, and so on from existing cluster systems to the new cluster system.

Note: In an earlier lab, you may have configured resfault, nofailover, resadminwait, and injeopardy triggers on all the systems in each cluster. Because the trigger scripts are the same in every cluster, you do not need to modify the existing scripts. However, ensure that all the systems have the same trigger scripts.

If you reinstalled the new system, copy triggers to the system.

10 Start cluster services on the new system and verify cluster membership.

11 Update service group and resource configuration to use the new system.

Note: Service group attributes, such as SystemList, AutoStartList, SystemZones, and localized resource attributes, such as Device for NIC or IP resource types, may need to be modified.

12 Verify updates to the configuration by switching the application services to the new system.
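The following sketch shows the communication files and commands behind steps 5, 7, 10, and 11, using the sample values (new node train2 joining cluster vcs2, cluster ID 2, interconnect on qfe0 and qfe1, low-priority link on eri0). The node IDs shown are examples only; match them to your existing /etc/llthosts, and see Appendix C for the complete solution.

# /etc/llthosts (identical on every node) maps node IDs to names; for example:
0 train3
1 train4
2 train2
# /etc/gabtab (every node): seed count equals the new number of nodes
/sbin/gabconfig -c -n3
# /etc/llttab on train2 (sketch; link lines must name your interconnect interfaces):
set-node train2
set-cluster 2
link qfe0 /dev/qfe:0 - ether - -
link qfe1 /dev/qfe:1 - ether - -
link-lowpri eri0 /dev/eri:0 - ether - -
# Start LLT, GAB, and then VCS on the new node (step 10), and add it to the groups (step 11)
lltconfig -c          # start LLT from /etc/llttab
sh /etc/gabtab        # start GAB
hastart
haconf -makerw
hagrp -modify name3SG1 SystemList -add train2 2    # repeat for each service group
hagrp -modify name3SG1 AutoStartList -add train2
haconf -dump -makero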


Fill in the design worksheet with values appropriate for your cluster and use the information to merge two running VCS clusters.

Task 3: Merging Two Running VCS Clusters

Sample Value Your Value

Node name, cluster name, and ID of the small cluster (the one-node cluster that will be merged to the three-node cluster)

train1 vcs1 1

Node name, cluster name, and ID of the large cluster (the three-node cluster that remains running all through the merging process)

train2 train3 train4 vcs2 2

Names of service groups configured in the small cluster

name1SG1, name1SG2, name2SG1, name2SG2, NetworkSG, ClusterService

Names of service groups configured in the large cluster

name3SG1, name3SG2, name4SG1, name4SG2, NetworkSG, ClusterService



In the following steps, it is assumed that the small cluster is merged into the large cluster; that is, the merged cluster keeps the name and ID of the large cluster, and the large cluster is not brought down during the whole process.

1 Modify VCS communication files on the large cluster to recognize the systems to be added from the small cluster.

Note: You do not need to stop and restart LLT and GAB on the existing systems in the large cluster when you make changes to the configuration files unless the /etc/llttab file contains the following directives that need to be changed:
– include system_ID_range
– exclude system_ID_range
– set-addr systemID tag address
For more information on these directives, check the VCS manual pages on llttab.

2 Add the names of the systems in the small cluster to the large cluster.

Names of service groups configured in the merged four-node cluster

name1SG1, name1SG2, name2SG1, name2SG2, name3SG1, name3SG2, name4SG1, name4SG2, NetworkSG, ClusterService

Cluster interconnect configuration for the four-node cluster

train1: qfe0 qfe1
train2: qfe0 qfe1
train3: qfe0 qfe1
train4: qfe0 qfe1
Low-priority link:
train1: eri0
train2: eri0
train3: eri0
train4: eri0

Any localized resource attributes in the small cluster

Any localized resource attributes in the large cluster



3 Install any additional application software required to support the merged configuration on all systems.

Note: You are not required to install any additional software for the classroom exercise. This step is included to aid you if you are using this lab as a guide in a real-world environment.

4 Configure any additional application software required to support the merged configuration on all systems.

All the systems should be capable of running the application services when the clusters are merged. Preparing application resources may include:
– Creating user accounts
– Copying application configuration files
– Creating mount points
– Verifying shared storage access

Note: For this lab, you only need to create the necessary mount points on all the systems for the shared file systems used in both VCS clusters (both vcs1 and vcs2 in this example).

5 Install any additional VCS Enterprise agents on each system.

Notes:
– No agents are required to be installed for this lab exercise.
– Enterprise agents should only be installed, not configured.

6 Copy any additional custom agents to all systems.

Notes:
– No custom agents are required to be copied for this lab exercise.
– Custom agents should only be installed, not configured.

7 Extract the service group configuration from the small cluster and add it to the large cluster configuration.

8 Copy or merge any existing trigger scripts on all systems.

Note: In an earlier lab, you may have configured resfault, nofailover, resadminwait, and injeopardy triggers on all the systems in each cluster. Because the trigger scripts are the same in every cluster, you do not need to modify the existing scripts. However, ensure that all the systems have the same trigger scripts.


9 Stop cluster services (VCS, fencing, GAB, LLT) on the systems in the small cluster.

Note: Leave application services running on the systems.

10 Reconfigure VCS communication modules on the systems in the small cluster and physically connect the cluster interconnect links.

11 Start cluster services (LLT, GAB, fencing, VCS) on the systems in the small cluster and verify cluster memberships.

12 Update service group and resource configuration to use all the systems.

Note: Service group attributes, such as SystemList, AutoStartList, and SystemZones, and localized resource attributes, such as Device for NIC or IP resource types, may need to be modified.

13 Verify updates to the configuration by switching application services between the systems in the merged cluster.


Lab 2 Details: Service Group Dependencies


Students work separately to configure and test service group dependencies.

Brief instructions for this lab are located on the following page:
• “Lab 2 Synopsis: Service Group Dependencies,” page A-7

Solutions for this exercise are located on the following page:
• “Lab 2 Solution: Service Group Dependencies,” page C-25

Lab 2: Service Group Dependencies

(Slide graphic: the service group dependency types tested in this lab, online local, online global, and offline local, between the nameSG1 child group and the nameSG2 parent group. Use the lab appendix best suited to your experience level: Appendix A: Lab Synopses, Appendix B: Lab Details, or Appendix C: Lab Solutions.)


If you already have both a nameSG1 and nameSG2 service group, skip this section.

1 Verify that nameSG1 is online on your local system.

2 Copy the loopy script to the / directory on both systems that were in the original two-node cluster.

3 Record the values for your service group in the worksheet.

4 Open the cluster configuration.

5 Create the service group using either the GUI or CLI.

6 Modify the SystemList attribute to add the original two systems in your cluster.

7 Modify the AutoStartList attribute to allow the service group to start on your system.

8 Verify that the service group can autostart and that it is a failover service group.

9 Save and close the cluster configuration and view the configuration file to verify your changes.

Note: In the GUI, the Close configuration action also saves the configuration.

Preparing Service Groups

Service Group Definition Sample Value Your Value

Group nameSG2

Required Attributes

FailOverPolicy Priority

SystemList train1=0 train2=1

Optional Attributes

AutoStartList train1


10 Create a nameProcess2 resource using the appropriate values in your worksheet.

11 Set the resource to not critical.

12 Set the required attributes for this resource, and any optional attributes, if needed.

13 Enable the resource.

14 Bring the resource online on your system.

15 Verify that the resource is online in VCS and at the operating system level.

16 Save and close the cluster configuration and view the configuration file to verify your changes.

Resource Definition Sample Value Your Value

Service Group nameSG2

Resource Name nameProcess2

Resource Type Process

Required Attributes

PathName /bin/sh

Optional Attributes

Arguments /name2/loopy name 2

Critical? No (0)

Enabled? Yes (1)
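If you prefer the command line, this is a sketch of steps 5 through 7 and 10 through 15 using the worksheet values; substitute your own name prefix and system names:

haconf -makerw
hagrp -add nameSG2
hagrp -modify nameSG2 SystemList train1 0 train2 1
hagrp -modify nameSG2 AutoStartList train1
hares -add nameProcess2 Process nameSG2
hares -modify nameProcess2 Critical 0
hares -modify nameProcess2 PathName /bin/sh
hares -modify nameProcess2 Arguments "/name2/loopy name 2"
hares -modify nameProcess2 Enabled 1
hares -online nameProcess2 -sys train1
haconf -dump -makero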


1 Take the nameSG1 and nameSG2 service groups offline.

2 Open the cluster configuration.

3 Delete the systems added in Lab 1 from the SystemList attribute for your two nameSGx service groups.

Note: Skip this step if you did not complete the “Combining Clusters” lab.

4 Create an online local firm dependency between nameSG1 and nameSG2 with nameSG1 as the child group.

5 Bring both service groups online on your system.

6 After the service groups are online, attempt to switch both service groups to any other system in the cluster.

What do you see?

7 Stop the loopy process for nameSG1 on your_sys by sending a kill signal. Watch the service groups in the GUI closely and record how nameSG2 reacts.

8 Stop the loopy process for nameSG1 on their_sys by sending a kill signal on that system. Watch the service groups in the GUI closely and record how nameSG2 reacts.

9 Clear any faulted resources.

10 Verify that the nameSG1 and nameSG2 service groups are offline.

11 Remove the dependency between the service groups.

Testing Online Local Firm
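The dependency in this section can also be created and removed from the command line; a minimal sketch, with nameSG1 as the child group as in step 4:

haconf -makerw
hagrp -link nameSG2 nameSG1 online local firm    # parent group first, then child
# ... perform the tests in this section ...
hagrp -unlink nameSG2 nameSG1                    # step 11
haconf -dump -makero

The later dependency sections use the same pair of commands with the corresponding category, location, and type (for example, online local soft, online local hard, online global firm, online global soft, or offline local) in place of online local firm.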


1 Create an online local soft dependency between the nameSG1 and nameSG2 service groups with nameSG1 as the child group.

2 Bring both service groups online on your system.

3 After the service groups are online, attempt to switch both service groups to any other system in the cluster.

What do you see?

4 Stop the loopy process for nameSG1 by sending a kill signal. Watch the service groups in the GUI closely and record how nameSG2 reacts.

5 Stop the loopy process for nameSG1 on their system by sending a kill signal. Watch the service groups in the GUI closely and record how the nameSG2 service group reacts.

6 Describe the differences you observe between the online local firm and online local soft service group dependencies.

7 Clear any faulted resources.

8 Verify that the nameSG1 and nameSG2 service groups are offline.

9 Bring the nameSG1 and nameSG2 service groups online on your system.

10 Kill the loopy process for nameSG2. Watch the service groups in the GUI closely and record how nameSG1 reacts.

11 Clear any faulted resources.

Testing Online Local Soft


12 Verify that the nameSG1 and nameSG2 service groups are offline.

13 Remove the dependency between the service groups.


Note: Skip this section if you are using a version of VCS earlier than 4.0. Hard dependencies are only supported in VCS 4.0 and later versions.

1 Create an online local hard dependency between the nameSG1 and nameSG2 service groups with nameSG1 as the child group.

2 Bring both groups online on your system, if they are not already online.

3 After the service groups are online, attempt to switch both service groups to any other system in the cluster.

What do you see?

4 Stop the loopy process for nameSG2 by sending a kill signal. Watch the service groups in the GUI closely and record how nameSG1 reacts.

5 Stop the loopy process for nameSG2 on their system by sending the kill signal. Watch the service groups in the GUI and record how nameSG1 reacts.

6 Which differences were observed between the online local firm/soft and online local hard service group dependencies?

7 Clear any faulted resources.

8 Verify that the nameSG1 and nameSG2 service groups are offline.

9 Remove the dependency between the service groups.

Testing Online Local Hard


1 Create an online global firm dependency between nameSG2 and nameSG1, with nameSG1 as the child group.

2 Bring both service groups online on your system.

3 After the service groups are online, attempt to switch either service group to any other system in the cluster.

What do you see?

4 Stop the loopy process for the nameSG1 by sending a kill signal. Watch the service groups in the GUI closely and record how nameSG2 reacts.

5 Stop the loopy process for nameSG1 on their system by sending a kill signal. Watch the service groups in the GUI closely and record how nameSG2 reacts.

6 Clear any faulted resources.

7 Verify that both service groups are offline.

8 Remove the dependency between the service groups.

Testing Online Global Firm Dependencies


1 Create an online global soft dependency between the nameSG2 and nameSG1 service groups with nameSG1 as the child group.

2 Bring both service groups online on your system.

3 After the service groups are online, attempt to switch either service group to their system.

What do you see?

4 Switch the service group to your system.

5 Stop the loopy process for nameSG1 by sending the kill signal. Watch the service groups in the GUI closely and record how nameSG2 reacts.

6 Stop the loopy process for nameSG1 on their system by sending the kill signal. Watch the service groups in the GUI closely and record how nameSG2 reacts.

7 What differences were observed between the online global firm and online global soft service group dependencies?

8 Clear any faulted resources.

9 Verify that both service groups are offline.

10 Remove the dependency between the service groups.

Testing Online Global Soft Dependencies


1 Create a service group dependency between nameSG1 and nameSG2 such that, if nameSG1 fails over to the same system running nameSG2, nameSG2 is shut down. There is no dependency that requires nameSG2 to be running for nameSG1 or nameSG1 to be running for nameSG2.

2 Bring the service groups online on different systems.

3 Stop the loopy process for the nameSG2 by sending a kill signal. Record what happens to the service groups.

4 Clear the faulted resource and restart the service groups on different systems.

5 Stop the loopy process for nameSG1 on their_sys by sending the kill signal. Record what happens to the service groups.

6 Clear any faulted resources.

7 Verify that both service groups are offline.

8 Remove the dependency between the service groups.

9 When all lab participants have completed the lab exercise, save and close the cluster configuration.

Testing Offline Local Dependency


Implement the behavior of an offline local dependency using the FileOnOff and ElifNone resource types to detect when the service groups are running on the same system.

Hint: Set MonitorInterval and the OfflineMonitorInterval for the ElifNone resource type to 5 seconds.

Remove these resources after the test.

Optional Lab: Using FileOnOff and ElifNone


Lab 3 Details: Testing Workload Management


Students work separately to configure and test workload management using the simulator.

Brief instructions for this lab are located on the following page:
• “Lab 3 Synopsis: Testing Workload Management,” page A-14

Solutions for this exercise are located on the following page:
• “Lab 3 Solution: Testing Workload Management,” page C-45

Lab 3: Testing Workload Management

Simulator config file location: _________________________________________

Copy to: ___________________________________________


1 Add /opt/VRTScssim/bin to your PATH environment variable after any /opt/VRTSvcs/bin entries, if it is not already present.

2 Set VCS_SIMULATOR_HOME to /opt/VRTScssim, if it is not already set.

3 Start the Simulator GUI.

4 Add a cluster.

5 Use these values to define the new simulated cluster:
– Cluster Name: wlm
– System Name: S1
– Port: 15560
– Platform: Solaris
– WAC Port: -1

6 In a terminal window, change to the simulator configuration directory for the new simulated cluster named wlm.

7 Copy the main.cf.SGWM.lab file provided by your instructor to a file named main.cf in the simulation configuration directory.

Source location of main.cf.SGWM.lab file:

___________________________________________cf_files_dir

8 From the Simulator GUI, start the wlm cluster.

9 Launch the VCS Java Console for the wlm simulated cluster.

10 Log in as admin with password password.

Preparing the Simulator Environment


11 Notice the cluster name is now VCS. This is the cluster name specified in the new main.cf file you copied into the config directory.

12 Verify that the configuration matches the description shown in the table. There should be eight failover service groups and the ClusterService group running on four systems in the cluster. Two service groups should be running on each system (as per the AutoStartList attribute). Verify your configuration against this chart:

13 In the terminal window you opened previously, set the VCS_SIM_PORT environment variable to 15560.

Note: Use this terminal window for all subsequent commands.

Service Group SystemList AutoStartList

A1 S1 1 S2 2 S3 3 S4 4 S1

A2 S1 1 S2 2 S3 3 S4 4 S1

B1 S1 4 S2 1 S3 2 S4 3 S2

B2 S1 4 S2 1 S3 2 S4 3 S2

C1 S1 3 S2 4 S3 1 S4 2 S3

C2 S1 3 S2 4 S3 1 S4 2 S3

D1 S1 2 S2 3 S3 4 S4 1 S4

D2 S1 2 S2 3 S3 4 S4 1 S4
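Steps 1, 2, and 13 translate to the following Bourne/Korn shell settings; a sketch assuming the default simulator installation path shown in the steps:

PATH=$PATH:/opt/VRTScssim/bin; export PATH
VCS_SIMULATOR_HOME=/opt/VRTScssim; export VCS_SIMULATOR_HOME
VCS_SIM_PORT=15560; export VCS_SIM_PORT   # step 13, for the wlm cluster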


1 Verify that the failover policy of all service groups is Priority.

2 Verify that all service groups are online on these systems:

3 If the A1 service group faults, where should it fail over? Verify the failover by faulting a critical resource in the A1 service group.

4 If A1 faults again, without clearing the previous fault, where should it fail over? Verify the failover by faulting a critical resource in the A1 service group.

5 Clear the existing faults in the A1 service group. Then, fault a critical resource in the A1 service group. Where should the service group fail to now?

6 Clear the existing fault in the A1 service group.

Testing Priority Failover Policy

System S1 S2 S3 S4

Groups A1 B1 C1 D1

A2 B2 C2 D2


1 Set the failover policy to load for the eight service groups.

2 Set the Load attribute for each service group based on the following chart.

3 Set S1 and S2 Capacity to 200. Set S3 and S4 Capacity to 100. (This is the default value.)

4 The current status of online service groups should look like this:

5 If the A1 service group faults, where should it fail over? Fault a critical resource in A1.

Load Failover Policy

Group Load

A1 75

A2 75

B1 75

B2 75

C1 50

C2 50

D1 50

D2 50

System S1 S2 S3 S4

Groups A1 B1 C1 D1

A2 B2 C2 D2

AvailableCapacity 50 50 0 0
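A sketch of the commands behind steps 1 through 3, shown for one service group and two systems; repeat for the remaining groups and systems per the charts:

haconf -makerw
hagrp -modify A1 FailOverPolicy Load
hagrp -modify A1 Load 75
hasys -modify S1 Capacity 200
hasys -modify S3 Capacity 100     # 100 is the default value
haconf -dump -makero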


6 The current status of online service groups should look like this:

7 If the S2 system fails, where should those service groups fail over? Select the S2 system in Cluster Manager and power it off.

8 The current status of online service groups should look like this:

9 Power up the S2 system in the Simulator, clear all faults, and return the service groups to their startup locations.

10 The current status of online service groups should look like this:

System             S1     S2          S3      S4

Groups             A2     B1 B2 A1    C1 C2   D1 D2

AvailableCapacity  125    -25         0       0

System             S1          S2             S3          S4

Groups             B1 B2 A2    (powered off)  C1 C2 A1    D1 D2

AvailableCapacity  -25         200            -75         0

System S1 S2 S3 S4

Groups A1 B1 C1 D1

A2 B2 C2 D2

AvailableCapacity 50 50 0 0


Leave the load settings, but use the Prerequisites and Limits so no more than three service groups of A1, A2, B1, or B2 can run on a system at any one time.

1 Set Limit for each system to ABGroup 3.

2 Set Prerequisites for service groups A1, A2, B1, and B2 to be 1 ABGroup.

3 Power off S1 in the Simulator. Where do the A1 and A2 service groups fail over?

4 Power off S2 in the Simulator. Where do the A1, A2, B1, and B2 service groups fail over?

5 Power off S3 in the Simulator. Where do the A1, A2, B1, B2, C1, and C2 service groups fail over?

6 Save and close the cluster configuration.

7 Log off from the Cluster Manager.

8 Stop the wlm cluster.

Prerequisites and Limits
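Limits (a system attribute) and Prerequisites (a service group attribute) are name/value pairs. A sketch of steps 1 and 2, shown for one system and one group and repeated for the others; verify the exact syntax against your VCS documentation:

haconf -makerw
hasys -modify S1 Limits ABGroup 3           # repeat for S2, S3, and S4
hagrp -modify A1 Prerequisites ABGroup 1    # repeat for A2, B1, and B2
haconf -dump -makero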


Lab 4 Details: Configuring Multiple Network Interfaces


The purpose of this lab is to replace the NIC and IP resources with their MultiNIC counterparts. Students work together in some portions of this lab and separately in others.

Brief instructions for this lab are located on the following page:
• “Lab 4 Synopsis: Configuring Multiple Network Interfaces,” page A-20

Solutions for this exercise are located on the following page:
• “Lab 4 Solution: Configuring Multiple Network Interfaces,” page C-63

Solaris

Students work together initially to modify the NetworkSG service group to replace the NIC resource with a MultiNICB resource. Then, students work separately to modify their own nameSG1 service group to replace the IP type resource with an IPMultiNICB resource.

Mobile

The mobile equipment in your classroom may not support this lab exercise.

AIX, HP-UX, Linux

Skip to the MultiNICA and IPMultiNIC section. Here, students work together initially to modify the NetworkSG service group to replace the NIC resource with a MultiNICA resource. Then, students work separately to modify their own service group to replace the IP type resource with an IPMultiNIC resource.

Virtual Academy

Skip this lab if you are working in the Virtual Academy.

Lab 4: Configuring Multiple Network Interfaces

(Slide graphic: resource diagram for this lab, showing the nameSG1 and nameSG2 service groups with their DiskGroup, Volume, Mount, Process, Proxy, and IP-type resources, and the NetworkSG service group with the shared network resource and a Phantom resource. The NIC and IP resources are replaced with their MultiNIC counterparts during the lab.)


Network Cabling—All Platforms

Note: The MultiNICB lab requires another IP on the 10.x.x.x network to be present outside of the cluster. Normally, other students’ clusters will suffice for this requirement. However, if there are no other clusters with the 10.x.x.x network defined yet, the trainer system can be used.

Your instructor can bring up a virtual IP of 10.10.10.1 on the public network interface on the trainer system, or another classroom system.

(Slide graphic: classroom network cabling for systems A through D, showing the public/classroom network, the private cluster interconnect networks, and the additional interfaces reserved for the MultiNIC/VVR/GCO labs, with cable counts for a four-node cluster.)


1 Verify the cabling or recable the network according to the previous diagram.

2 Set up base IP addresses for the interfaces used by the MultiNICB resource.

a Set up the /etc/hosts file on each system to have an entry for each interface on each system using the following address scheme where W, X, Y, and Z are system numbers.

The following example shows you how the /etc/hosts file looks for the cluster containing systems train11, train12, train13, and train14.

Preparing Networking

/etc/hosts

10.10.W.2 trainW_qfe2

10.10.W.3 trainW_qfe3

10.10.X.2 trainX_qfe2

10.10.X.3 trainX_qfe3

10.10.Y.2 trainY_qfe2

10.10.Y.3 trainY_qfe3

10.10.Z.2 trainZ_qfe2

10.10.Z.3 trainZ_qfe3

/etc/hosts

10.10.11.2 train11_qfe2

10.10.11.3 train11_qfe3

10.10.12.2 train12_qfe2

10.10.12.3 train12_qfe3

10.10.13.2 train13_qfe2

10.10.13.3 train13_qfe3

10.10.14.2 train14_qfe2

10.10.14.3 train14_qfe3


b Set up /etc/hostname.interface files on all systems to enable these IP addresses to be started at boot time. Use the following syntax:

/etc/hostname.qfe2
trainX_qfe2 netmask + broadcast + deprecated -failover up

/etc/hostname.qfe3
trainX_qfe3 netmask + broadcast + deprecated -failover up

c Check the local-mac-address? eeprom setting; ensure that it is set to true on each system. If not, change this setting to true.

d Reboot all systems for the addresses and the eeprom setting to take effect. Do this in such a way as to keep the services highly available.
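For steps c and d, a sketch of the Solaris commands; the rolling reboot assumes you switch service groups away from each node first:

# Step c: check the setting, then change it if it is not already true
eeprom 'local-mac-address?'
eeprom 'local-mac-address?=true'
# Step d: evacuate a node, then reboot it; repeat one node at a time
hagrp -switch nameSG1 -to train2
init 6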


Use the values in the table to configure a MultiNICB resource.

1 Open the cluster configuration.

2 Add the resource to the NetworkSG service group.

3 Set the resource to not critical.

4 Set the required attributes for this resource, and any optional attributes if needed.

5 Enable the resource.

6 Verify that the resource is online in VCS and at the operating system level.

7 Set the resource to critical.

8 Save the cluster configuration and view the configuration file to verify your changes.

Configuring MultiNICB

Resource Definition Sample Value Your Value

Service Group NetworkSG

Resource Name NetworkMNICB

Resource Type MultiNICB

Required Attributes

Device qfe2 qfe3

Critical? No (0)

Enabled? Yes (1)
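A CLI sketch of steps 1 through 8, using the worksheet values. The Device value is written here as an association of interface names to interface IDs, which is an assumption to verify against the VCS Bundled Agents Reference Guide for your platform:

haconf -makerw
hares -add NetworkMNICB MultiNICB NetworkSG
hares -modify NetworkMNICB Critical 0
hares -modify NetworkMNICB Device qfe2 0 qfe3 1   # assumed association format
hares -modify NetworkMNICB Enabled 1
hares -probe NetworkMNICB -sys train1    # MultiNICB is persistent, so probe rather than online it
hares -modify NetworkMNICB Critical 1
haconf -dump -makero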


Optional mpathd Configuration

9 You may configure MultiNICB to use mpathd mode as shown in the following steps.

a Obtain the IP addresses for the /etc/defaultrouter file from your instructor.

__________________________ __________________________

b Modify the /etc/defaultrouter file on each system, substituting the IP addresses provided in LINE1 and LINE2.

LINE1: route add host 192.168.xx.x -reject 127.0.0.1
LINE2: route add default 192.168.xx.1

c Set TRACK_INTERFACES_ONLY_WITH_GROUP to yes in /etc/default/mpathd.

d Set the UseMpathd attribute for NetworkMNICB to 1.

e Set the MpathdCommand attribute to /sbin/in.mpathd.

f Save the cluster configuration.
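Steps d through f can be performed from the command line; a sketch (the in.mpathd path shown is the one used in this lab and may differ on your system):

hares -modify NetworkMNICB UseMpathd 1
hares -modify NetworkMNICB MpathdCommand "/sbin/in.mpathd"   # path is an assumption; verify locally
haconf -dump -makero    # step f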


In this portion of the lab, work separately to modify the Proxy resource in your nameSG1 service group to reference the MultiNICB resource.

1 Take the nameIP1 resource and all resources above it offline in the nameSG1 service group.

2 Disable the nameProxy1 resource.

3 Edit the nameProxy1 resource and change its target resource name to NetworkMNICB.

4 Enable the nameProxy1 resource.

5 Delete the nameIP1 resource.

Reconfiguring Proxy

Resource Definition Sample Value Your Value

Service Group nameSG1

Resource Name nameProxy1

Resource Type Proxy

Required Attributes

TargetResName NetworkMNICB

Critical? No (0)

Enabled? Yes (1)
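A CLI sketch of steps 1 through 5, assuming the group is online on train1, the configuration is already writable (haconf -makerw), and nameIP1 has no remaining resource links (unlink it first with hares -unlink if it does):

hares -offline nameIP1 -sys train1       # after taking any resources above it offline
hares -modify nameProxy1 Enabled 0
hares -modify nameProxy1 TargetResName NetworkMNICB
hares -modify nameProxy1 Enabled 1
hares -delete nameIP1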


Create an IPMultiNICB resource in the nameSG1 service group.

Configuring IPMultiNICB

Resource Definition Sample Value Your Value

Service Group nameSG1

Resource Name nameIPMNICB1

Resource Type IPMultiNICB

Required Attributes

BaseResName NetworkMNICB

Netmask 255.255.255.0

Address See the table that follows.

Critical? No (0)

Enabled? Yes (1)

System Address

train1 192.168.xxx.51

train2 192.168.xxx.52

train3 192.168.xxx.53

train4 192.168.xxx.54

train5 192.168.xxx.55

train6 192.168.xxx.56

train7 192.168.xxx.57

train8 192.168.xxx.58

train9 192.168.xxx.59

train10 192.168.xxx.60

train11 192.168.xxx.61

train12 192.168.xxx.62


1 Add the resource to the service group.

2 Set the resource to not critical.

3 Set the required attributes for this resource, and any optional attributes if needed.

4 Enable the resource.

5 Bring the resource online on your system.

6 Verify that the resource is online in VCS and at the operating system level.

7 Save the cluster configuration.
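A CLI sketch of steps 1 through 7, using the sample values for train1; substitute your classroom network number for xxx and your own address from the table:

hares -add nameIPMNICB1 IPMultiNICB nameSG1
hares -modify nameIPMNICB1 Critical 0
hares -modify nameIPMNICB1 BaseResName NetworkMNICB
hares -modify nameIPMNICB1 Address 192.168.xxx.51
hares -modify nameIPMNICB1 Netmask 255.255.255.0   # attribute name as listed in the worksheet
hares -modify nameIPMNICB1 Enabled 1
hares -online nameIPMNICB1 -sys train1
haconf -dump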


1 Link the nameIPMNICB1 resource to the nameProxy1 resource.

2 Switch the nameSG1 service group between the systems to test its resources on each system. Verify that the IP address specified in the nameIPMNICB1 resource switches with the service group.

3 Set the new resource to critical (nameIPMNICB1).

4 Save the cluster configuration.

Linking and Testing IPMultiNICB
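A sketch of steps 1 through 4; the parent resource is named first in hares -link, and the example assumes train1 and train2 are your two systems:

hares -link nameIPMNICB1 nameProxy1
hagrp -switch nameSG1 -to train2
hagrp -switch nameSG1 -to train1
hares -modify nameIPMNICB1 Critical 1
haconf -dump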


Note: Wait for all participants to complete the steps to this point. Then, test the NetworkMNICB resource by performing the following procedure.

Each student can take turns to test their resource, or all can observe one test.

1 Determine which interface the nameIPMNICB1 resource is using on the system where it is currently online.

2 Unplug the network cable from that interface.

What happens to the nameIPMNICB1 IP address?

3 Use ifconfig to determine the status of the interface with the unplugged cable.

4 Leave the network cable unplugged. Unplug the other interface that the NetworkMNICB resource is now using.

What happens to the NetworkMNICB resource and the nameSG1 service group?

5 Replace the cables.

What happens?

6 Clear the nameIPMNICB1 resource if it is faulted.

7 Save and close the configuration.

Testing IPMultiNICB Failover


Note: Only complete this lab if you are working on an AIX, HP-UX, or Linux system in your classroom.

Work together using the values in the table to create a MultiNICA resource.

Alternate Lab: Configuring MultiNICA and IPMultiNIC

Resource Definition Sample Value Your Value

Service Group NetworkSG

Resource Name NetworkMNICA

Resource Type MultiNICA

Required Attributes

Device (See the table that follows for admin IPs.)

AIX: en3, en4
HP-UX: lan3, lan4
Linux: eth3, eth4

NetworkHosts (HP-UX only)

192.168.xx.xxx (See the instructor.)

NetMask (AIX, Linux only)

255.255.255.0

Critical? No (0)

Enabled? Yes (1)

System Admin IP Address

train1 10.10.10.101

train2 10.10.10.102

train3 10.10.10.103

train4 10.10.10.104

train5 10.10.10.105

train6 10.10.10.106

train7 10.10.10.107

train8 10.10.10.108


train9 10.10.10.109

train10 10.10.10.110

train11 10.10.10.111

train12 10.10.10.112



1 Verify the cabling or recable the network according to the previous diagram.

2 Set up the /etc/hosts file on each system to have an entry for each interface on each system in the cluster using the following address scheme where 1, 2, 3, and 4 are system numbers.

/etc/hosts
10.10.10.101 train1_mnica
10.10.10.102 train2_mnica
10.10.10.103 train3_mnica
10.10.10.104 train4_mnica

3 Verify that NetworkSG is online on both systems.

4 Open the cluster configuration.

5 Add the NetworkMNICA resource to the NetworkSG service group.

6 Set the resource to not critical.

7 Set the required attributes for this resource, and any optional attributes if needed.

8 Enable the resource.

9 Verify that the resource is online in VCS and at the operating system level.

10 Make the resource critical.

11 Save the cluster configuration and view the configuration file to verify your changes.
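A CLI sketch of steps 4 through 11, shown with the AIX interface names and the first two admin IP addresses from the table. MultiNICA's Device attribute pairs each interface with its base (administrative) IP address and is localized per system here; treat the exact association format as an assumption to verify against the Bundled Agents Reference Guide:

haconf -makerw
hares -add NetworkMNICA MultiNICA NetworkSG
hares -modify NetworkMNICA Critical 0
hares -local NetworkMNICA Device
hares -modify NetworkMNICA Device en3 10.10.10.101 en4 10.10.10.101 -sys train1
hares -modify NetworkMNICA Device en3 10.10.10.102 en4 10.10.10.102 -sys train2
hares -modify NetworkMNICA NetMask 255.255.255.0   # AIX and Linux only
hares -modify NetworkMNICA Enabled 1
hares -modify NetworkMNICA Critical 1
haconf -dump -makero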


In this portion of the lab, modify the Proxy resource in the nameSG1 service group to reference the MultiNICA resource.

1 Take the nameIP1 resource and all resources above it offline in the nameSG1 service group.

2 Disable the nameProxy1 resource.

3 Edit the nameProxy1 resource and change its target resource name to NetworkMNICA.

4 Enable the nameProxy1 resource.

5 Delete the nameIP1 resource.

Reconfiguring Proxy

Resource Definition Sample Value Your Value

Service Group nameSG1

Resource Name nameProxy1

Resource Type Proxy

Required Attributes

TargetResName NetworkMNICA

Critical? No (0)

Enabled? Yes (1)


Each student works separately to create an IPMultiNIC resource in their own nameSG1 service group using the values in the table.

Configuring IPMultiNIC

Resource Definition Sample Value Your Value

Service Group nameSG1

Resource Name nameIPMNIC1

Resource Type IPMultiNIC

Required Attributes

MultiNICResName NetworkMNICA

Address See the table that follows.

NetMask (HP-UX, Linux only)

255.255.255.0

Critical? No (0)

Enabled? Yes (1)

System Address

train1 192.168.xxx.51

train2 192.168.xxx.52

train3 192.168.xxx.53

train4 192.168.xxx.54

train5 192.168.xxx.55

train6 192.168.xxx.56

train7 192.168.xxx.57

train8 192.168.xxx.58

train9 192.168.xxx.59

train10 192.168.xxx.60

train11 192.168.xxx.61

train12 192.168.xxx.62


1 Add the resource to the service group.

2 Set the resource to not critical.

3 Set the required attributes for this resource, and any optional attributes if needed.

4 Enable the resource.

5 Bring the resource online on your system.

6 Verify that the resource is online in VCS and at the operating system level.

7 Save the cluster configuration.
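A minimal CLI sketch of steps 1 through 7, assuming the sample values from the table (192.168.xxx.51 is a placeholder address; the NetMask line applies to HP-UX and Linux only):

haconf -makerw
hares -add nameIPMNIC1 IPMultiNIC nameSG1
hares -modify nameIPMNIC1 Critical 0
hares -modify nameIPMNIC1 MultiNICResName NetworkMNICA
hares -modify nameIPMNIC1 Address 192.168.xxx.51
hares -modify nameIPMNIC1 NetMask 255.255.255.0
hares -modify nameIPMNIC1 Enabled 1
hares -online nameIPMNIC1 -sys your_sys
hares -state nameIPMNIC1
haconf -dump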


1 Link the nameIPMNIC1 resource to the nameProxy1 resource.

2 If present, link the nameProcess1 or nameApp1 resource to nameIPMNIC1.

3 Switch the nameSG1 service group between the systems to test its resources on each system. Verify that the IP address specified in the nameIPMNIC1 resource switches with the service group.

4 Set the new resource to critical (nameIPMNIC1).

5 Save the cluster configuration.
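A minimal CLI sketch of the linking and test steps above (their_sys and nameProcess1 are placeholders for your own system and application resource names):

hares -link nameIPMNIC1 nameProxy1
hares -link nameProcess1 nameIPMNIC1
hagrp -switch nameSG1 -to their_sys
hagrp -switch nameSG1 -to your_sys
hares -modify nameIPMNIC1 Critical 1
haconf -dump -makero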

Linking IPMultiNIC


Note: Wait for all participants to complete the steps to this point. Then test the NetworkMNICA resource by performing the following procedure.

Each student can take turns to test their resource, or all can observe one test.

1 Determine which interface the nameIPMNIC1 resource is using on the system where it is currently online.

2 Unplug the network cable from that interface.

What happens to the nameIPMNIC1 IP address?

3 Use ifconfig (or netstat) to determine the status of the interface with the unplugged cable.

4 Leave the network cable unplugged. Unplug the other interface that the NetworkMNICA resource is now using.

What happens to the NetworkMNICA resource and the nameSG1 service group?

5 Replace the cables.

What happens?

6 Clear the nameIPMNIC1 resource if it is faulted.

7 Save and close the configuration.
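A short sketch of commands that can help during this test, assuming a Solaris-style environment (interface names vary by platform):

ifconfig -a
netstat -in
hares -state nameIPMNIC1
hares -clear nameIPMNIC1
haconf -dump -makero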

Testing IPMultiNIC Failover


Appendix C: Lab Solutions


Lab Solution 1: Reconfiguring Cluster Membership


Lab 1 Solution: Combining Clusters

Students work together to create four-node clusters by combining two-node clusters.

Brief instructions for this lab are located on the following page:
• “Lab 1 Synopsis: Reconfiguring Cluster Membership,” page A-2

Step-by-step instructions for this lab are located on the following page:
• “Lab 1 Details: Reconfiguring Cluster Membership,” page B-3

[Slide: Lab 1: Reconfiguring Cluster Membership. The diagram shows Tasks 1 through 3: a system is removed from one two-node cluster (Task 1), added to the other two-node cluster (Task 2), and the remaining one-node cluster is then merged in to form the four-node cluster (Task 3). Use the lab appendix best suited to your experience level: Appendix A: Lab Synopses, Appendix B: Lab Details, Appendix C: Lab Solutions.]


Lab Assignments

Fill in the table with the applicable values for your lab cluster.

Sample Value Your Value

Node names, cluster name, and cluster ID of the two-node cluster from which a system will be removed

train1 train2 vcs1 1

Node names, cluster name, and cluster ID of the two-node cluster to which a system will be added

train3 train4 vcs2 2

Node names, cluster name, and cluster ID of the final four-node cluster

train1 train2 train3 train4 vcs2 2


Fill in the design worksheet with values appropriate for your cluster and use the information to remove a system from a running VCS cluster.

Task 1: Removing a System from a Running VCS Cluster

Sample Value Your Value

Cluster name of the two-node cluster from which a system will be removed

vcs1

Name of the system to be removed

train2

Name of the system to remain in the cluster

train1

Cluster interconnect configuration

train1: qfe0 qfe1
train2: qfe0 qfe1

Low-priority link:
train1: eri0
train2: eri0

Names of the service groups configured in the cluster

name1SG1, name1SG2, name2SG1, name2SG2, NetworkSG, ClusterService

Any localized resource attributes in the cluster


1 Prevent application failover to the system to be removed, persisting through VCS restarts.

hasys -freeze -persistent -evacuate train2

2 Switch any application services that are running on the system to be removed to any other system in the cluster.

Note: This step can be combined with either step 1 or step 3 as an option to a single command line.

This step has been combined with step 1.

3 Stop VCS on the system to be removed.

hastop -sys train2

Note: Steps 1-3 can also be accomplished using the following commands:

hasys -freeze train2
hastop -sys train2 -evacuate

4 Remove any disk heartbeat configurations on the system to be removed.

Note: No disk heartbeats are configured in the classroom. This step is included as a reminder in the event you use this lab in a real-world environment.

5 Stop VCS communication modules (GAB and LLT) and I/O fencing on the system to be removed.

Note: On the Solaris platform, you also need to unload the kernel modules.

On the system to be removed, train2 in this example:

/etc/init.d/vxfen stop (if fencing is configured)
gabconfig -U
lltconfig -U

Solaris Only
modinfo | grep gab
modunload -i gab_ID
modinfo | grep llt
modunload -i llt_ID
modinfo | grep vxfen
modunload -i fen_ID

6 Physically remove cluster interconnect links from the system to be removed.

7 Remove VCS software from the system taken out of the cluster.

Note: For purposes of this lab, you do not need to remove the software because this system is put back in the cluster later. This step is included in case you use this lab as a guide to removing a system from a cluster in a real-world environment.

8 Update service group and resource configurations that refer to the system that is removed.

Note: Service group attributes, such as AutoStartList, SystemList, and SystemZones, and localized resource attributes, such as Device for NIC or IP resource types, may need to be modified.

On the system remaining in the cluster, train1 in this example:

haconf -makerw

For all service groups that have train2 in their SystemList and AutoStartList attributes:

hagrp -modify groupname AutoStartList -delete train2
hagrp -modify groupname SystemList -delete train2

9 Remove the system from the cluster configuration.

hasys -delete train2

10 Save the cluster configuration.

haconf -dump -makero


11 Modify the VCS communication configuration files on the remaining systems in the cluster to reflect the change.

– Edit /etc/llthosts on all the systems remaining in the cluster (train1 in this example) to remove the line corresponding to the removed system (train2 in this example).

– Edit /etc/gabtab on all the systems remaining in the cluster (train1 in this example) to reduce the -n option to gabconfig by 1.

Note: You do not need to stop and restart LLT and GAB on the remaining systems when you make changes to the configuration files unless the /etc/llttab file contains the following directives that need to be changed:
– include system_ID_range
– exclude system_ID_range
– set-addr systemID tag address
For more information on these directives, see the VCS manual pages on llttab.
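As a reference, a sketch of the resulting files on train1 after the removal, assuming train1 is node ID 0 (keep whatever node IDs your cluster already uses):

/etc/llthosts on train1:
0 train1

/etc/gabtab on train1 (the -n value is reduced from 2 to 1):
/sbin/gabconfig -c -n 1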


Fill in the design worksheet with values appropriate for your cluster and use the information to add a system to a running VCS cluster.

Task 2: Adding a System to a Running VCS Cluster

Sample Value Your Value

Cluster name of the two-node cluster to which a system will be added

vcs2

Name of the system to be added

train2

Names of systems already in cluster

train3 train4

Cluster interconnect configuration for the three-node cluster

train2: qfe0 qfe1
train3: qfe0 qfe1
train4: qfe0 qfe1

Low-priority link:
train2: eri0
train3: eri0
train4: eri0

Names of service groups configured in the cluster

name3SG1, name3SG2, name4SG1, name4SG2, NetworkSG, ClusterService

Any localized resource attributes in the cluster


1 Install any necessary application software on the new system.

Note: In the classroom, you do not need to install any other set of application binaries on your system for this lab.

2 Configure any application resources necessary to support clustered applications on the new system.

Note: The new system should be capable of running the application services in the cluster it is about to join. Preparing application resources may include:
– Creating user accounts
– Copying application configuration files
– Creating mount points
– Verifying shared storage access
– Checking NFS major and minor numbers

Note: For this lab, you only need to create the necessary mount points on all the systems for the shared file systems used in the running VCS clusters (vcs2 in this example).

Create four new mount points:
mkdir /name31
mkdir /name32
mkdir /name41
mkdir /name42

3 Physically cable cluster interconnect links.

Note: If the original cluster is a two-node cluster with crossover cables for the cluster interconnect, you need to change to hubs or switches before you can add another node. Ensure that the cluster interconnect is not completely disconnected while you are carrying out the changes.

4 Install VCS on the new system. If you skipped the removal step in the previous section as recommended, you do not need to install VCS on this system.

Notes:
– You can either use the installvcs script with the -installonly option to automate the installation of the VCS software or use the command specific to the operating platform, such as pkgadd for Solaris, swinstall for HP-UX, installp -a for AIX, or rpm for Linux, to install the VCS software packages individually.
– If you are installing packages manually:
› Follow the package dependencies. For the correct order, refer to the VERITAS Cluster Server Installation Guide.
› After the packages are installed, license VCS on the new system using the /opt/VRTS/bin/vxlicinst -k command.

a Record the location of the installation software provided by your instructor.

Installation software location:_______________________________________

b Start the installation.

cd /install_location
./installvcs -installonly

c Specify the name of the new system to the script (train2 in this example).

5 Configure VCS communication modules (GAB, LLT) on the added system.

Note: You must complete this step even if you did not remove and reinstall the VCS software.

› /etc/llttab
This file should have the same cluster ID as the other systems in the cluster. This is the /etc/llttab file used in this example configuration:
set-cluster 2
set-node train2
link tag1 /dev/interface1:x - ether - -
link tag2 /dev/interface2:x - ether - -
link-lowpri tag3 /dev/interface3:x - ether - -

Linux
On Linux, do not prepend the interface with /dev in the link specification.


› /etc/llthosts
This file should contain a unique node number for each system in the cluster, and it should be the same on all systems in the cluster. This is the /etc/llthosts file used in this example configuration:
0 train3
1 train4
2 train2

› /etc/gabtab
This file should contain the command to start GAB and any configured disk heartbeats. This is the /etc/gabtab file used in this example configuration:
/sbin/gabconfig -c -n 3

Note: The seed number used after the -n option shown previously should be equal to the total number of systems in the cluster.

6 Configure fencing on the new system, if used in the cluster.

Create /etc/vxfendg and enter the coordinator disk group name.
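A minimal sketch, assuming the coordinator disk group is named vxfencoorddg (substitute the name used in your cluster):

echo "vxfencoorddg" > /etc/vxfendg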

7 Update VCS communication configuration (GAB, LLT) on the existing systems.

Note: You do not need to stop and restart LLT and GAB on the existing systems in the cluster when you make changes to the configuration files unless the /etc/llttab file contains the following directives that need to be changed:
– include system_ID_range
– exclude system_ID_range
– set-addr systemID tag address

For more information on these directives, check the VCS manual pages for llttab.


a Edit /etc/llthosts on all the systems in the cluster (train3 and train4 in this example) to add an entry corresponding to the new system (train2 in this example).

On train3 and train4:
# vi /etc/llthosts
0 train3
1 train4
2 train2

b Edit /etc/gabtab on all the systems in the cluster (train3 and train4 in this example) to increase the -n option to gabconfig by 1.

On train3 and train4:
# vi /etc/gabtab
/sbin/gabconfig -c -n 3

8 Install any VCS Enterprise agents required on the new system.

Notes:
– No agents are required to be installed for this lab exercise.
– Enterprise agents should only be installed, not configured.

9 Copy any triggers, custom agents, scripts, and so on from existing cluster systems to the new cluster system.

Note: In an earlier lab, you may have configured resfault, nofailover, resadminwait, and injeopardy triggers on all the systems in each cluster. Because the trigger scripts are the same in every cluster, you do not need to modify the existing scripts. However, ensure that all the systems have the same trigger scripts.

If you reinstalled the new system, copy triggers to the system.

cd /opt/VRTSvcs/bin/triggers
rcp train3:/opt/VRTSvcs/bin/triggers/* .


10 Start cluster services on the new system and verify cluster membership.

On train2:
lltconfig -c
gabconfig -c -n 3
gabconfig -a

Port a membership should include the node ID for train2.

/etc/init.d/vxfen start
hastart
gabconfig -a

Both port a and port h memberships should include the node ID for train2.

Note: You can also use LLT, GAB, and VCS startup files installed by the VCS packages to start cluster services.

11 Update service group and resource configuration to use the new system.

Note: Service group attributes, such as SystemList, AutoStartList, and SystemZones, and localized resource attributes, such as Device for NIC or IP resource types, may need to be modified.

haconf -makerw

For all service groups in the vcs2 cluster, modify the SystemList and AutoStartList attributes:

hagrp -modify groupname SystemList -add train2 priority
hagrp -modify groupname AutoStartList -add train2

When you have completed the modifications:
haconf -dump -makero

12 Verify updates to the configuration by switching the application services to the new system.

For all service groups in the vcs2 cluster:
hagrp -switch groupname -to train2


Fill in the design worksheet with values appropriate for your cluster and use the information to merge two running VCS clusters.

Task 3: Merging Two Running VCS Clusters


Sample Value Your Value

Node name, cluster name, and ID of the small cluster (the one-node cluster that will be merged to the three-node cluster)

train1 vcs1 1

Node name, cluster name, and ID of the large cluster (the three-node cluster that remains running all through the merging process)

train2 train3 train4 vcs2 2

Names of service groups configured in the small cluster

name1SG1, name1SG2, name2SG1, name2SG2, NetworkSG, ClusterService

Names of service groups configured in the large cluster

name3SG1, name3SG2, name4SG1, name4SG2, NetworkSG, ClusterService

Names of service groups configured in the merged four-node cluster

name1SG1, name1SG2, name2SG1, name2SG2, name3SG1, name3SG2, name4SG1, name4SG2, NetworkSG, ClusterService

Cluster interconnect configuration for the four-node cluster

train1: qfe0 qfe1train2: qfe0 qfe1train3: qfe0 qfe1train4: qfe0 qfe1Low-priority link:

train1: eri0train2: eri0train3: eri0train4: eri0

Any localized resource attributes in the small cluster

Any localized resource attributes in the large cluster


In the following steps, it is assumed that the small cluster is merged to the large cluster; that is, the merged cluster keeps the name and ID of the large cluster, and the large cluster is not brought down during the whole process.

1 Modify VCS communication files on the large cluster to recognize the systems to be added from the small cluster.

Note: You do not need to stop and restart LLT and GAB on the existing systems in the large cluster when you make changes to the configuration files unless the /etc/llttab file contains the following directives that need to be changed:
– include system_ID_range
– exclude system_ID_range
– set-addr systemID tag address
For more information on these directives, check the VCS manual pages on llttab.

– Edit /etc/llthosts on all the systems in the large cluster to add entries corresponding to the new systems from the small cluster.

On train2, train3, and train4:
vi /etc/llthosts
0 train4
1 train3
2 train2
3 train1

– Edit /etc/gabtab on all the systems in the large cluster to increase the -n option to gabconfig by the number of systems in the small cluster.

On train2, train3, and train4:
vi /etc/gabtab
/sbin/gabconfig -c -n 4

2 Add the names of the systems in the small cluster to the large cluster.

haconf -makerw
hasys -add train1
haconf -dump -makero


3 Install any additional application software required to support the merged configuration on all systems.

Note: You are not required to install any additional software for the classroom exercise. This step is included to aid you if you are using this lab as a guide in a real-world environment.

4 Configure any additional application software required to support the merged configuration on all systems.

All the systems should be capable of running the application services when the clusters are merged. Preparing application resources may include:
– Creating user accounts
– Copying application configuration files
– Creating mount points
– Verifying shared storage access

Note: For this lab, you only need to create the necessary mount points on all the systems for the shared file systems used in both VCS clusters (both vcs1 and vcs2 in this example).

› On the train1 system, create four new mount points:

mkdir /name31
mkdir /name32
mkdir /name41
mkdir /name42

› On systems train3 and train4, you also need to create four new mount points (train2 should already have these mount points created. If not, you need to create these mount points on train2 as well.):

mkdir /name11
mkdir /name12
mkdir /name21
mkdir /name22

5 Install any additional VCS Enterprise agents on each system.

Notes:
– No agents are required to be installed for this lab exercise.
– Enterprise agents should only be installed, not configured.


6 Copy any additional custom agents to all systems.

Notes:
– No custom agents are required to be copied for this lab exercise.
– Custom agents should only be installed, not configured.

7 Extract the service group configuration from the small cluster and add it to the large cluster configuration.

a On the small cluster, vcs1 in this example, create a main.cmd file.

hacf -cftocmd /etc/VRTSvcs/conf/config

b Edit main.cmd and filter the commands related to service group configuration. Note that you do not need the commands related to the ClusterService and NetworkSG service groups because these already exist in the large cluster.

c Copy the filtered main.cmd file to a running system in the large cluster, for example, to train3.

d On the system in the large cluster where you copied the main.cmd file, train3 in vcs2 in this example, open the configuration.

haconf -makerw

e Execute the filtered main.cmd file.

sh main.cmd

Note: There are no customized resource types used in the lab exercises.
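A condensed sketch of steps a through e, assuming the filtered command file is copied to /var/tmp on train3 (the destination path, and filtering main.cmd by hand to keep only the hagrp and hares commands for name1SG1, name1SG2, name2SG1, and name2SG2, are assumptions):

On train1:
hacf -cftocmd /etc/VRTSvcs/conf/config
cd /etc/VRTSvcs/conf/config
vi main.cmd
rcp main.cmd train3:/var/tmp/main.cmd

On train3:
haconf -makerw
sh /var/tmp/main.cmd
haconf -dump -makero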

8 Copy or merge any existing trigger scripts on all systems.

Note: In an earlier lab, you may have configured resfault, nofailover, resadminwait, and injeopardy triggers on all the systems in each cluster. Because the trigger scripts are the same in every cluster, you do not need to modify the existing scripts. However, ensure that all the systems have the same trigger scripts.


9 Stop cluster services (VCS, fencing, GAB, LLT) on the systems in the small cluster.

Note: Leave application services running on the systems.

a On one system in the small cluster (train1 in vcs1 in this example), stop VCS.

hastop -all -force

b On all the systems in the small cluster (train1 in vcs1 in this example), stop fencing, GAB, and LLT.

/etc/init.d/vxfen stop
gabconfig -U
lltconfig -U

10 Reconfigure VCS communication modules on the systems in the small cluster and physically connect the cluster interconnect links.

On all the systems in the small cluster (train1 in vcs1 in this example):

a Edit /etc/llttab and modify the cluster ID to be the same as the large cluster.

vi /etc/llttab
set-cluster 2
set-node train1
link interface1 /dev/interface1:0 - ether - -
link interface2 /dev/interface2:0 - ether - -
link-lowpri interface3 /dev/interface3:0 - ether - -

Linux
On Linux, do not prepend the interface with /dev in the link specification.

b Edit /etc/llthosts and ensure that there is a unique entry for all systems in the combined cluster.

vi /etc/llthosts
0 train4
1 train3
2 train2
3 train1


c Edit /etc/gabtab and modify the –n option to gabconfig to reflect the total number of systems in combined clusters.

vi /etc/gabtab
/sbin/gabconfig -c -n 4

11 Start cluster services (LLT, GAB, fencing, VCS) on the systems in the small cluster and verify cluster memberships.

On train1:

lltconfig -c
gabconfig -c -n 4
gabconfig -a

Port a membership should include the node ID for train1, in addition to the node IDs for train2, train3, and train4.

/etc/init.d/vxfen start
hastart
gabconfig -a

Both port a and port h memberships should include the node ID for train1, in addition to the node IDs for train2, train3, and train4.

Note: You can also use LLT, GAB, and VCS startup files installed by the VCS packages to start cluster services.

12 Update service group and resource configuration to use all the systems.

Note: Service group attributes, such as SystemList, AutoStartList, and SystemZones, and localized resource attributes, such as Device for NIC or IP resource types, may need to be modified.

a Open the cluster configuration.

haconf -makerw

b For the service groups copied from the small cluster (name1SG1, name1SG2, name2SG1, and name2SG2 in this example), add train2, train3, and train4 to the SystemList and AutoStartList attributes:

hagrp -modify groupname SystemList -add train2 priority2 \
train3 priority3 train4 priority4
hagrp -modify groupname AutoStartList -add train2 \
train3 train4


c For the service groups that existed in the large cluster before the merging (name3SG1, name3SG2, name4SG1, name4SG2, NetworkSG, and ClusterService in this example), add train1 to the SystemList and AutoStartList attributes:

hagrp -modify groupname SystemList -add train1 priority1
hagrp -modify groupname AutoStartList -add train1

d Close and save the cluster configuration.

haconf -dump -makero

13 Verify updates to the configuration by switching application services between the systems in the merged cluster.

For all the systems and service groups in the merged cluster, verify operation:

hagrp -switch groupname -to systemname


Lab 2 Solution: Service Group Dependencies


Students work separately to configure and test service group dependencies.

Brief instructions for this lab are located on the following page:
• “Lab 2 Synopsis: Service Group Dependencies,” page A-7

Step-by-step instructions for this lab are located on the following page:
• “Lab 2 Details: Service Group Dependencies,” page B-17

Note: If you already have a nameSG2 service group, skip this section.

1 Verify that nameSG1 is online on your local system.

hastatus -sum
hagrp -online nameSG1 -sys your_sys
or
hagrp -switch nameSG1 -to your_sys

Preparing Service Groups

[Slide: Lab 2: Service Group Dependencies. The diagram shows the dependency types tested in this lab (online local, online global, and offline local) between the parent group nameSG2 and the child group nameSG1. Use the lab appendix best suited to your experience level: Appendix A: Lab Synopses, Appendix B: Lab Details, Appendix C: Lab Solutions.]


2 Copy the loopy script to the / directory on both systems that were in the original two-node cluster.

All:
cp /name1/loopy /loopy

Solaris, AIX, HP-UX:
rcp /name1/loopy their_sys:/

Linux:
scp /name1/loopy their_sys:/

3 Record the values for your service group in the worksheet.

4 Open the cluster configuration.

haconf -makerw

5 Create the service group using either the GUI or CLI.

hagrp -add nameSG2

6 Modify the SystemList attribute to add the original two systems in your cluster.

hagrp -modify nameSG2 SystemList -add your_sys 0 their_sys 1

Service Group Definition Sample Value Your Value

Group nameSG2

Required Attributes

FailOverPolicy Priority

SystemList train1=0 train2=1

Optional Attributes

AutoStartList train1


7 Modify the AutoStartList attribute to allow the service group to start on your system.

hagrp -modify nameSG2 AutoStartList your_sys

8 Verify that the service group can auto start and that it is a failover service group.

hagrp -display nameSG2

9 Save and close the cluster configuration and view the configuration file to verify your changes.

Note: In the GUI, the Close configuration action saves the configuration automatically.

haconf -dump -makero
view /etc/VRTSvcs/conf/config/main.cf

10 Create a nameProcess2 resource using the appropriate values in your worksheet.

hares -add nameProcess2 Process nameSG2

11 Set the resource to not critical.

hares -modify nameProcess2 Critical 0

Resource Definition Sample Value Your Value

Service Group nameSG2

Resource Name nameProcess2

Resource Type Process

Required Attributes

PathName /bin/sh

Optional Attributes

Arguments /name2/loopy name 2

Critical? No (0)

Enabled? Yes (1)


12 Set the required attributes for this resource, and any optional attributes, if needed.

hares -modify nameProcess2 PathName /bin/sh
hares -modify nameProcess2 Arguments "/loopy name 2"

Note: If you are using the GUI to configure the resource, you do not need to include the quotation marks.

13 Enable the resource.

hares -modify nameProcess2 Enabled 1

14 Bring the resource online on your system.

hares -online nameProcess2 -sys your_sys

15 Verify that the resource is online in VCS and at the operating system level.

hares -display nameProcess2

16 Save and close the cluster configuration and view the configuration file to verify your changes.

haconf -dump -makero
view /etc/VRTSvcs/conf/config/main.cf


1 Take the nameSG1 and nameSG2 service groups offline.

hagrp -offline nameSG1 -sys online_sys
hagrp -offline nameSG2 -sys online_sys

2 Open the cluster configuration.

haconf -makerw

3 Delete the systems added in Lab 1 from the SystemList attribute for your two nameSGx service groups.

Note: Skip this step if you did not complete the “Combining Clusters” lab.

hagrp -modify nameSG1 SystemList -delete other_sys1 other_sys2
hagrp -modify nameSG2 SystemList -delete other_sys1 other_sys2

4 Create an online local firm dependency between nameSG1 and nameSG2 with nameSG1 as the child group.

hagrp -link nameSG2 nameSG1 online local firm

5 Bring both service groups online on your system.

hagrp -online nameSG1 -sys your_sys
hagrp -online nameSG2 -sys your_sys

6 After the service groups are online, attempt to switch both service groups to any other system in the cluster.

hagrp -switch nameSG1 -to their_sys
hagrp -switch nameSG2 -to their_sys

What do you see?

A group dependency violation occurs if you attempt to move either the parent or the child group. You cannot switch groups in an online local firm dependency without taking the parent (nameSG2) offline first.

Testing Online Local Firm


7 Stop the loopy process for nameSG1 on your_sys by sending a kill signal. Watch the service groups in the GUI closely and record how nameSG2 reacts.

From your system, type:
ps -ef | grep "loopy name 1"
kill pid

– The nameSG1 service group is taken offline because of the fault.
– The nameSG2 service group is taken offline because it depends on nameSG1.
– The nameSG1 service group fails over and restarts on their_sys.
– The nameSG2 service group is started on their_sys after nameSG1 is restarted.

8 Stop the loopy process for nameSG1 on their_sys by sending a kill signal on that system. Watch the service groups in the GUI closely and record how nameSG2 reacts.

From their system, type:
ps -ef | grep "loopy name 1"
kill pid

– The nameSG1 service group is taken offline because of the fault.
– The nameSG2 service group is taken offline because it depends on nameSG1.
– The nameSG1 service group is faulted on all systems in SystemList and cannot fail over.
– The nameSG2 service group remains offline because it depends on nameSG1.

9 Clear any faulted resources.

hagrp -clear nameSG1

10 Verify that the nameSG1 and nameSG2 service groups are offline.

hastatus -sum

11 Remove the dependency between the service groups.

hagrp -unlink nameSG2 nameSG1


1 Create an online local soft dependency between the nameSG1 and nameSG2 service groups with nameSG1 as the child group.

hagrp -link nameSG2 nameSG1 online local soft

2 Bring both service groups online on your system.

hagrp -online nameSG1 -sys your_sys
hagrp -online nameSG2 -sys your_sys

3 After the service groups are online, attempt to switch both service groups to any other system in the cluster.

hagrp -switch nameSG1 -to their_sys
hagrp -switch nameSG2 -to their_sys

What do you see?

A group dependency violation occurs if you move either the parent or the child group. You cannot switch groups in an online local soft dependency without taking the parent (nameSG2) offline first.

4 Stop the loopy process for nameSG1 by sending a kill signal. Watch the service groups in the GUI closely and record how nameSG2 reacts.

From your system:
ps -ef | grep "loopy name 1"
kill pid

– The nameSG1 service group is taken offline because of the fault.
– The nameSG1 service group fails over and restarts on their_sys.
– After nameSG1 is restarted, nameSG2 is taken offline because nameSG1 and nameSG2 must run on the same system.
– The nameSG2 service group is started on their_sys after nameSG1 is restarted.

Testing Online Local Soft


5 Stop the loopy process for nameSG1 on their system by sending a kill signal. Watch the service groups in the GUI closely and record how the nameSG2 service group reacts.

From their system:
ps -ef | grep "loopy name 1"
kill pid

– The nameSG1 service group is taken offline because of the fault.
– The nameSG1 service group has no other available system and remains offline.
– The nameSG2 service group continues to run.

6 Describe the differences you observe between the online local firm and online local soft service group dependencies.

– Firm: If nameSG1 is taken offline, so is nameSG2.
– Soft: The nameSG2 service group is allowed to continue to run until nameSG1 is brought online somewhere else. Then, nameSG2 must follow nameSG1.

7 Clear any faulted resources.

hagrp -clear nameSG1

8 Verify that the nameSG1 and nameSG2 service groups are offline.

hagrp -offline nameSG2 -sys their_sys
hastatus -sum

9 Bring the nameSG1 and nameSG2 service groups online on your system.

hagrp -online nameSG1 -sys your_sys
hagrp -online nameSG2 -sys your_sys


10 Kill the loopy process for nameSG2. Watch the service groups in the GUI closely and record how nameSG1 reacts.

From your system:
ps -ef | grep "loopy name 2"
kill pid

– The nameSG2 service group is taken offline because of the fault.
– The nameSG1 service group remains running on your system because the child is not affected by the fault of the parent. (This is true for online local firm as well.)

11 Clear any faulted resources.

hagrp -clear nameSG2

12 Verify that the nameSG1 and nameSG2 service groups are offline.

hagrp -offline nameSG1 -sys your_sys
hastatus -sum

13 Remove the dependency between the service groups.

hagrp -unlink nameSG2 nameSG1


Note: Skip this section if you are using a version of VCS earlier than 4.0. Hard dependencies are only supported in VCS 4.0 and later versions.

1 Create an online local hard dependency between the nameSG1 and nameSG2 service groups with nameSG1 as the child group.

hagrp -link nameSG2 nameSG1 online local hard

2 Bring both groups online on your system, if they are not already online.

hagrp -switch nameSG2 -to your_sys
hastatus -sum

3 After the service groups are online, attempt to switch both service groups to any other system in the cluster.

hagrp -switch nameSG1 -to their_sys

What do you see?

A group dependency violation occurs if you switched the child without the parent.

hagrp -switch nameSG2 -to their_sys

The parent group can be switched and moves the child with a hard dependency rule.

4 Stop the loopy process for nameSG2 by sending a kill signal. Watch the service groups in the GUI closely and record how nameSG1 reacts.

From your system:
ps -ef | grep "loopy name 2"
kill pid

– The nameSG2 service group is taken offline because of the fault.
– If a failover target exists (which it does in this case), then nameSG1 is taken offline because of the hard dependency rule; if the parent faults (and there is a failover target), take the child offline.

Testing Online Local Hard


– The nameSG1 service group is brought online on their system.
– The nameSG2 service group is started on their_sys after nameSG1 is restarted.

5 Stop the loopy process for nameSG2 on their system by sending the kill signal. Watch the service groups in the GUI and record how nameSG1 reacts.

From their system:
ps -ef | grep "loopy name 2"
kill pid

– The nameSG2 service group is taken offline because of the fault.
– The nameSG2 service group has no failover targets, so nameSG1 remains online on the original system.

6 Which differences were observed between the online local firm/soft and online local hard service group dependencies?

– Firm/Soft: The parent failing does not cause the child to fail over.
– Hard: The parent failing can cause the child to fail over.

7 Clear any faulted resources.

hagrp -clear nameSG2

8 Verify that the nameSG1 and nameSG2 service groups are offline.

hagrp -offline nameSG1 -sys their_sys
hastatus -sum

9 Remove the dependency between the service groups.

hagrp -unlink nameSG2 nameSG1


1 Create an online global firm dependency between nameSG2 and nameSG1, with nameSG1 as the child group.

hagrp -link nameSG2 nameSG1 online global firm

2 Bring both service groups online on your system.

hagrp -online nameSG1 -sys your_sys
hagrp -online nameSG2 -sys your_sys

3 After the service groups are online, attempt to switch either service group to any other system in the cluster.

hagrp -switch nameSG1 -to their_sys
hagrp -switch nameSG2 -to their_sys

What do you see?

– The nameSG1 service group can not switch because nameSG2 requires it to stay online.

– The nameSG2 service group can switch; nameSG1 does not depend on it.

4 Stop the loopy process for nameSG1 by sending a kill signal. Watch the service groups in the GUI closely and record how nameSG2 reacts.

From your system:
ps -ef | grep "loopy name 1"
kill pid

– The nameSG1 service group is taken offline because of the fault.
– The nameSG2 service group is taken offline because it depends on nameSG1.
– The nameSG1 service group fails over to their system.
– The nameSG2 service group restarts after nameSG1 is online.

Testing Online Global Firm Dependencies


5 Stop the loopy process for nameSG1 on their system by sending a kill signal. Watch the service groups in the GUI closely and record how nameSG2 reacts.

From their system:
ps -ef | grep "loopy name 1"
kill pid

– The nameSG1 service group is taken offline because of the fault.
– The nameSG2 service group is taken offline because it depends on nameSG1.
– The nameSG1 service group is faulted on all systems and remains offline.
– The nameSG2 service group cannot start without nameSG1.

6 Clear any faulted resources.

hagrp -clear nameSG1

7 Verify that both service groups are offline.

hastatus -sum

8 Remove the dependency between the service groups.

hagrp -unlink nameSG2 nameSG1


1 Create an online global soft dependency between the nameSG2 and nameSG1 service groups with nameSG1 as the child group.

hagrp -link nameSG2 nameSG1 online global soft

2 Bring both service groups online on your system.

hagrp -online nameSG1 -sys your_sys
hagrp -online nameSG2 -sys your_sys

3 After the service groups are online, attempt to switch either service group to their system in the cluster.

hagrp -switch nameSG1 -to their_sys
hagrp -switch nameSG2 -to their_sys

What do you see?

Either group can be switched because the parent does not need the child running after it has started.

4 Switch the service group to your system.

hagrp -switch nameSGx -to your_sys

5 Stop the loopy process for nameSG1 by sending the kill signal. Watch the service groups in the GUI closely and record how nameSG2 reacts.

From your system:
ps -ef | grep "loopy name 1"
kill pid

– The nameSG1 service group fails over to their system.
– The nameSG2 service group stays running where it was.

Testing Online Global Soft Dependencies


6 Stop the loopy process for nameSG1 on their system by sending the kill signal. Watch the service groups in the GUI closely and record how nameSG2 reacts.

From their system:
ps -ef | grep "loopy name 1"
kill pid

– The nameSG1 service group is faulted on all systems and is offline.
– The nameSG2 service group stays running where it was.

7 Which differences were observed between the online global firm and online global soft service group dependencies?

The nameSG2 service group stays running when nameSG1 faults with a soft dependency.

8 Clear any faulted resources.

hagrp -clear nameSG1

9 Verify that both service groups are offline.

hastatus -sum

10 Remove the dependency between the service groups.

hagrp -unlink nameSG2 nameSG1


1 Create a service group dependency between nameSG1 and nameSG2 such that, if nameSG1 fails over to the same system running nameSG2, nameSG2 is shut down. There is no dependency that requires nameSG2 to be running for nameSG1 or nameSG1 to be running for nameSG2.

hagrp -link nameSG2 nameSG1 offline local

2 Bring the service groups online on different systems.

hagrp -online nameSG2 -sys your_sys
hagrp -online nameSG1 -sys their_sys

3 Stop the loopy process for nameSG2 by sending a kill signal. Record what happens to the service groups.

From your system:
ps -ef | grep "loopy name 2"
kill pid

The nameSG2 service group should have nowhere to fail over, and it should remain offline.

4 Clear the faulted resource and restart the service groups on different systems.

hagrp -clear nameSG2
hagrp -online nameSG2 -sys your_sys

5 Stop the loopy process for nameSG1 on their_sys by sending the kill signal. Record what happens to the service groups.

From their system, type:
ps -ef | grep "loopy name 1"
kill pid

– The nameSG1 service group fails on their system, failing over to your system.
– The nameSG1 service group forces nameSG2 offline on your system.
– The nameSG2 service group is brought online on their system.

Testing Offline Local Dependency


6 Clear any faulted resources.

hagrp -clear nameSG1

7 Verify that both service groups are offline.

hagrp -offline nameSG2 -sys their_sys
hastatus -sum

8 Remove the dependency between the service groups.

hagrp -unlink nameSG2 nameSG1

9 When all lab participants have completed the lab exercise, save and close the cluster configuration.

haconf -dump -makero


Implement the behavior of an offline local dependency using the FileOnOff and ElifNone resource types to detect when the service groups are running on the same system.

Hint: Set MonitorInterval and the OfflineMonitorInterval for the ElifNone resource type to 5 seconds.

Remove these resources after the test.

hares -add nameElifNone2 ElifNone nameSG2
hares -modify nameElifNone2 PathName /tmp/TwoisHere
hares -modify nameElifNone2 Enabled 1
hares -link nameDG2 nameElifNone2

hares -add nameFileOnOff1 FileOnOff nameSG1
hares -modify nameFileOnOff1 PathName /tmp/TwoisHere
hares -modify nameFileOnOff1 Enabled 1
hares -link nameDG1 nameFileOnOff1

hatype -modify ElifNone MonitorInterval 5
hatype -modify ElifNone OfflineMonitorInterval 5

hagrp -online nameSG2 -sys your_sys
hagrp -online nameSG1 -sys their_sys

hagrp -switch nameSG1 -to your_sys

hagrp -offline nameSG1 -sys your_sys
hagrp -offline nameSG2 -sys their_sys
hares -unlink nameDG1 nameFileOnOff1
hares -unlink nameDG2 nameElifNone2
hares -delete nameElifNone2
hares -delete nameFileOnOff1

Optional Lab: Using FileOnOff and ElifNone


Lab 3 Solution: Testing Workload Management


Students work separately to configure and test workload management using the Simulator.

Brief instructions for this lab are located on the following page:
• “Lab 3 Synopsis: Testing Workload Management,” page A-14

Step-by-step instructions for this lab are located on the following page:
• “Lab 3 Details: Testing Workload Management,” page B-29

[Slide: Lab 3: Testing Workload Management. Worksheet:
Simulator config file location: _________________________________________
Copy to: _________________________________________
Use the lab appendix best suited to your experience level: Appendix A: Lab Synopses, Appendix B: Lab Details, Appendix C: Lab Solutions.]


1 Add /opt/VRTScssim/bin to your PATH environment variable after any /opt/VRTSvcs/bin entries, if it is not already present.

PATH=$PATH:/opt/VRTScssim/bin
export PATH

2 Set VCS_SIMULATOR_HOME to /opt/VRTScssim, if it is not already set.

VCS_SIMULATOR_HOME=/opt/VRTScssim
export VCS_SIMULATOR_HOME

3 Start the Simulator GUI.

hasimgui &

Preparing the Simulator Environment


4 Add a cluster.

Click Add Cluster.

5 Use these values to define the new simulated cluster:
– Cluster Name: wlm
– System Name: S1
– Port: 15560
– Platform: Solaris
– WAC Port: -1

6 In a terminal window, change to the simulator configuration directory for the new simulated cluster named wlm.

cd /opt/VRTScssim/wlm/conf/config


7 Copy the main.cf.SGWM.lab file provided by your instructor to a file named main.cf in the simulation configuration directory.

Source location of main.cf.SGWM.lab file:

___________________________________________ (referred to below as cf_files_dir)

cp cf_files_dir/main.cf.SGWM.lab /opt/VRTScssim/wlm/conf/config/main.cf

8 From the Simulator GUI, start the wlm cluster.

Select wlm under Cluster Name.
Click Start Cluster.


9 Launch the VCS Java Console for the wlm simulated cluster.

Select wlm under Cluster Name.
Click Launch Console.

10 Log in as admin with password password.


11 Notice the cluster name is now VCS. This is the cluster name specified in the new main.cf file you copied into the config directory.

12 Verify that the configuration matches the description shown in the table.

There should be eight failover service groups and the ClusterService group running on four systems in the cluster. Two service groups should be running on each system (as per the AutoStartList attribute). Verify your configuration against this chart:

Service Group SystemList AutoStartList

A1 S1 1 S2 2 S3 3 S4 4 S1

A2 S1 1 S2 2 S3 3 S4 4 S1

B1 S1 4 S2 1 S3 2 S4 3 S2

B2 S1 4 S2 1 S3 2 S4 3 S2

C1 S1 3 S2 4 S3 1 S4 2 S3

C2 S1 3 S2 4 S3 1 S4 2 S3

D1 S1 2 S2 3 S3 4 S4 1 S4

D2 S1 2 S2 3 S3 4 S4 1 S4


13 In the terminal window you opened previously, set the VCS_SIM_PORT environment variable to 15560.

Note: Use this terminal window for all subsequent commands.

VCS_SIM_PORT=15560
export VCS_SIM_PORT


1 Verify that the failover policy of all service groups is Priority.

hasim -grp -display -all -attribute FailOverPolicy

2 Verify that all service groups are online on these systems:

View the status in the Cluster Manager.

3 If the A1 service group faults, where should it fail over? Verify the failover by faulting a critical resource in the A1 service group.

Right-click a resource and select Fault.

A1 should fail over to S2.

4 If A1 faults again, without clearing the previous fault, where should it fail over? Verify the failover by faulting a critical resource in the A1 service group.

Right-click a resource and select Fault.

A1 should fail over to S3.

5 Clear the existing faults in the A1 service group. Then, fault a critical resource in the A1 service group. Where should the service group fail to now?

Right-click A1 and select Clear Fault—>Auto.
Right-click a resource and select Fault.

A1 should fail over to S1.

6 Clear the existing fault in the A1 service group.

Right-click A1 and select Clear Fault—>Auto.

Testing Priority Failover Policy

System S1 S2 S3 S4

Groups A1 B1 C1 D1

A2 B2 C2 D2


1 Set the failover policy to load for the eight service groups.

Select each service group from the object tree.
From the Properties tab, change the FailOverPolicy attribute to Load.

2 Set the Load attribute for each service group based on the following chart.

Load Failover Policy

Group Load

A1 75

A2 75

B1 75

B2 75

C1 50

C2 50

D1 50

D2 50


Select each service group from the object tree. From the Properties tab, select Show All Attributes and change the Load attribute.

3 Set S1 and S2 Capacity to 200. Set S3 and S4 Capacity to 100 (the default value).

Click the System icon at the top of the left panel to show the system object tree.
Select each system from the object tree.
From the Properties tab, select Show all attributes and change the Capacity attribute.
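As an alternative to the GUI, a CLI sketch for steps 1 through 3, assuming hasim accepts the same modify options as the corresponding hagrp and hasys commands (repeat for the remaining groups and systems):

hasim -grp -modify A1 FailOverPolicy Load
hasim -grp -modify A1 Load 75
hasim -grp -modify C1 Load 50
hasim -sys -modify S1 Capacity 200
hasim -sys -modify S3 Capacity 100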


4 The current status of online service groups should look like this:

Check the status from Cluster Manager (Cluster Status view).
Use the CLI:
hasim -sys -display -attribute AvailableCapacity

5 If the A1 service group faults, where should it fail over? Fault a critical resource in the A1 service group to observe.

Right-click a resource and select Fault.

A1 should fail over to S2.

6 The current status of online service groups should look like this:

Check the status from Cluster Manager (Cluster Status view).
Use the CLI:
hasim -sys -display -attribute AvailableCapacity

System            S1      S2      S3      S4
Groups            A1, A2  B1, B2  C1, C2  D1, D2
AvailableCapacity 50      50      0       0

System            S1    S2          S3      S4
Groups            A2    B1, B2, A1  C1, C2  D1, D2
AvailableCapacity 125   -25         0       0


7 If the S2 system fails, where should those service groups fail over? Select the S2 system in Cluster Manager and power it off.

Right-click S2 and select Power off.

B1 should fail over to S1.
B2 should fail over to S1.
A1 should fail over to S3.

8 The current status of online service groups should look like this:

Check the status from Cluster Manager (Cluster Status view).
Use the CLI:
hasim -sys -display -attribute AvailableCapacity

9 Power up the S2 system in the Simulator, clear all faults, and return the service groups to their startup locations.

Right-click S2 and select Up.
Right-click A1 and select Clear Fault—>Auto.
Right-click A1 and select Switch To—>S1.
Right-click B1 and select Switch To—>S2.
Right-click B2 and select Switch To—>S2.

System            S1          S2             S3          S4
Groups            B1, B2, A2  (powered off)  C1, C2, A1  D1, D2
AvailableCapacity -25         200            -75         0


10 The current status of online service groups should look like this:

Check the status from Cluster Manager (Cluster Status view).
Use the CLI:
hasim -sys -display -attribute AvailableCapacity

System            S1      S2      S3      S4
Groups            A1, A2  B1, B2  C1, C2  D1, D2
AvailableCapacity 50      50      0       0


Prerequisites and Limits

Leave the load settings as they are but use the Prerequisites and Limits so no more than three service groups of A1, A2, B1, or B2 can run on a system at any one time.

1 Set Limits for each system to ABGroup 3.

Select the S1 system.
From the Properties tab, click Show all Attributes.
Select the Limits attribute and click Edit.
Click the plus button.
Click the Key field and enter: ABGroup.
Click the Value field and enter: 3.

Repeat steps for S2, S3, and S4. Enter the same limit on each system.
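A possible command-line alternative is shown below; the exact syntax for modifying association-type attributes can vary by VCS version, so check the hasys manual page:

hasys -modify S1 Limits ABGroup 3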


2 Set Prerequisites for service groups A1, A2, B1, and B2 to be 1 ABGroup.

Select the A1 group.
From the Properties tab, click Show all Attributes.
Select the Prerequisites attribute and click Edit.
Click the plus button.
Click the Key field and enter: ABGroup.
Click the Value field and enter: 1.

Repeat steps for the A2, B1, and B2 groups. Enter the same prerequisites for these four groups.
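A possible command-line alternative (again, verify the association-attribute syntax against the hagrp manual page for your version):

hagrp -modify A1 Prerequisites ABGroup 1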

3 Power off S1 in the Simulator. Where do the A1 and A2 service groups fail over?

Right-click S1 and select Power off.

A1 should fail over to S2.
A2 should fail over to S3 because the limit is reached on S2.

4 Power off S2 in the Simulator. Where do the A1, A2, B1, and B2 service groups fail over?

Right-click S2 and select Power off.

A1 should fail over to S4.
B1 should fail over to S3.
B2 should fail over to S4.
These failovers occur based on the Load values.

5 Power off S3 in the Simulator. Where do the A1, A2, B1, B2, C1, and C2 service groups fail over?

Right-click S3 and select Power off.

All service groups fail over to S4 except B1. B1 is the last group to attempt to fail over to S4, and by then the ABGroup limit on S4 has already been reached; only three of the A1, A2, B1, and B2 service groups can run on the same system. B1 stays offline.

6 Save and close the cluster configuration.

Select File—>Close configuration.


7 Log off from the GUI.

Select File—>Log Out.

8 Stop the wlm cluster.

From the Simulator Java Console, select Stop Cluster.


Lab 4 Solution: Configuring Multiple Network Interfaces


The purpose of this lab is to replace the NIC and IP resources with their MultiNIC counterparts. Students work together in some portions of this lab and separately in others.

Brief instructions for this lab are located on the following page:
• “Lab 4 Synopsis: Configuring Multiple Network Interfaces,” page A-20

Step-by-step instructions for this lab are located on the following page:
• “Lab 4 Details: Configuring Multiple Network Interfaces,” page B-37

Solaris

Students work together initially to modify the NetworkSG service group to replace the NIC resource with a MultiNICB resource. Then, students work separately to modify their own nameSG1 service group to replace the IP type resource with an IPMultiNICB resource.

Mobile

The mobile equipment in your classroom may not support this lab exercise.

AIX, HP-UX, Linux

Skip to the MultiNICA and IPMultiNICA section. Here, students work together initially to modify the NetworkSG service group to replace the NIC resource with a MultiNICA resource. Then, students work separately to modify their own service group to replace the IP type resource with an IPMultiNIC resource.

Virtual Academy

Skip this lab if you are working in the Virtual Academy.

Lab 4: Configuring Multiple Network Interfaces

(Diagram: the target configuration showing the nameSG1, nameSG2, and NetworkSG service groups. Resources shown include nameProcess1/2, nameMount1/2, nameVol1/2, nameDG1/2, nameProxy1/2, nameIPM1, nameIP2, AppDG, AppVol, NetworkMNIC, NetworkNIC, and NetworkPhantom.)


Network Cabling—All Platforms

Note: The MultiNICB lab requires another IP on the 10.x.x.x network to be present outside of the cluster. Normally, other students’ clusters will suffice for this requirement. However, if there are no other clusters with the 10.x.x.x network defined yet, the trainer system can be used.

Your instructor can bring up a virtual IP of 10.10.10.1 on the public network interface on the trainer system, or another classroom system.

(Diagram: classroom network cabling for systems Sys A through Sys D, showing the private networks, the public/classroom network, and the MultiNIC/VVR/GCO connections. Cable counts for a four-node cluster: crossover (1), private network (8), public network (4), MultiNIC/VVR/GCO (8).)


Preparing Networking

1 Verify the cabling or recable the network according to the previous diagram.

2 Set up base IP addresses for the interfaces used by the MultiNICB resource.

a Set up the /etc/hosts file on each system to have an entry for each interface on each system using the following address scheme where W, X, Y, and Z are system numbers.

The following example shows you how the /etc/hosts file looks for the cluster containing systems train11, train12, train13, and train14.

/etc/hosts

10.10.W.2 trainW_qfe2

10.10.W.3 trainW_qfe3

10.10.X.2 trainX_qfe2

10.10.X.3 trainX_qfe3

10.10.Y.2 trainY_qfe2

10.10.Y.3 trainY_qfe3

10.10.Z.2 trainZ_qfe2

10.10.Z.3 trainZ_qfe3

/etc/hosts

10.10.11.2 train11_qfe2

10.10.11.3 train11_qfe3

10.10.12.2 train12_qfe2

10.10.12.3 train12_qfe3

10.10.13.2 train13_qfe2

10.10.13.3 train13_qfe3

10.10.14.2 train14_qfe2

10.10.14.3 train14_qfe3


b Set up /etc/hostname.interface files on all systems to enable these IP addresses to be started at boot time. Use the following syntax:

/etc/hostname.qfe2
trainX_qfe2 netmask + broadcast + deprecated -failover up

/etc/hostname.qfe3
trainX_qfe3 netmask + broadcast + deprecated -failover up

c Check the local-mac-address? eeprom setting. Ensure that it is set to true on each system. If not, change this setting to true.

eeprom | grep local-mac-address?
eeprom local-mac-address?=true

d Reboot all systems for the addresses and the eeprom setting to take effect. Do this in such a way as to keep the services highly available.
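One possible approach, sketched here for Solaris, is to evacuate and reboot one system at a time so that the service groups remain online somewhere in the cluster:

hastop -local -evacuate    (switch service groups to another system and stop VCS locally)
init 6                     (reboot this system)

Wait for the system to rejoin the cluster, and then repeat on the next system.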


Configuring MultiNICB

Use the values in the table to configure a MultiNICB resource.

Resource Definition      Sample Value       Your Value
Service Group            NetworkSG
Resource Name            NetworkMNICB
Resource Type            MultiNICB
Required Attributes
  Device                 qfe2 qfe3
Critical?                No (0)
Enabled?                 Yes (1)

1 Open the cluster configuration.

haconf -makerw

2 Add the resource to the NetworkSG service group.

hares -add NetworkMNICB MultiNICB NetworkSG

3 Set the resource to not critical.

hares -modify NetworkMNICB Critical 0

4 Set the required attributes for this resource, and any optional attributes if needed.

hares -modify NetworkMNICB Device interface1 0 interface2 1
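For example, with the qfe2 and qfe3 interfaces from the table, following the same key/value pattern as the generic command above:

hares -modify NetworkMNICB Device qfe2 0 qfe3 1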

5 Enable the resource.

hares -modify NetworkMNICB Enabled 1

6 Verify that the resource is online in VCS and at the operating system level.

hares -display NetworkMNICB
ifconfig -a



7 Set the resource to critical.

hares -modify NetworkMNICB Critical 1

8 Save the cluster configuration and view the configuration file to verify your changes.

haconf -dump

Optional mpathd Configuration

9 You may configure MultiNICB to use mpathd mode as shown in the following steps.

a Obtain the IP addresses for the /etc/defaultrouter file from your instructor.

__________________________ __________________________

b Modify the /etc/defaultrouter on each system substituting the IP addresses provided within LINE1 and LINE2.

LINE1: route add host 192.168.xx.x -reject 127.0.0.1
LINE2: route add default 192.168.xx.1

c Set TRACK_INTERFACES_ONLY_WITH_GROUP to yes in /etc/default/mpathd.

TRACK_INTERFACES_ONLY_WITH_GROUP=yes

d Set the UseMpathd attribute for NetworkMNICB to 1.

hares -modify NetworkMNICB UseMpathd 1

e Set the MpathdCommand attribute to /sbin/in.mpathd.

hares -modify NetworkMNICB MpathdCommand \
/sbin/in.mpathd

f Save the cluster configuration.

haconf -dump


Reconfiguring Proxy

In this portion of the lab, work separately to modify the Proxy resource in your nameSG1 service group to reference the MultiNICB resource.

Resource Definition      Sample Value       Your Value
Service Group            nameSG1
Resource Name            nameProxy1
Resource Type            Proxy
Required Attributes
  TargetResName          NetworkMNICB
Critical?                No (0)
Enabled?                 Yes (1)

1 Take the nameIP1 resource and all resources above it offline in the nameSG1 service group.

hares -dep nameIP1
hares -offline nameApp1 -sys system
hares -offline nameIP1 -sys system

2 Disable the nameProxy1 resource.

hares -modify nameProxy1 Enabled 0

3 Edit the nameProxy1 resource and change its target resource name to NetworkMNICB.

hares -modify nameProxy1 TargetResName NetworkMNICB

4 Enable the nameProxy1 resource.

hares -modify nameProxy1 Enabled 1

5 Delete the nameIP1 resource.

hares -delete nameIP1



Create an IPMultiNICB resource in the nameSG1 service group.

Configuring IPMultiNICB

Resource Definition Sample Value Your Value

Service Group nameSG1

Resource Name nameIPMNICB1

Resource Type IPMultiNICB

Required Attributes

BaseResName NetworkMNICB

Netmask 255.255.255.0

Address See the table that follows.

Critical? No (0)

Enabled? Yes (1)

train1 192.168.xxx.51

train2 192.168.xxx.52

train3 192.168.xxx.53

train4 192.168.xxx.54

train5 192.168.xxx.55

train6 192.168.xxx.56

train7 192.168.xxx.57

train8 192.168.xxx.58

train9 192.168.xxx.59

train10 192.168.xxx.60

train11 192.168.xxx.61

train12 192.168.xxx.62


1 Add the resource to the service group.

hares -add nameIPMNICB1 IPMultiNICB nameSG1

2 Set the resource to not critical.

hares -modify nameIPMNICB1 Critical 0

3 Set the required attributes for this resource, and any optional attributes if needed.

hares -modify nameIPMNICB1 Address IP_address
hares -modify nameIPMNICB1 BaseResName NetworkMNICB
hares -modify nameIPMNICB1 NetMask 255.255.255.0

4 Enable the resource.

hares -modify nameIPMNICB1 Enabled 1

5 Bring the resource online on your system.

hares -online nameIPMNICB1 -sys your_system

6 Verify that the resource is online in VCS and at the operating system level.

hares -display nameIPMNICB1
ifconfig -a

7 Save the cluster configuration.

haconf -dump


Linking and Testing IPMultiNICB

1 Link the nameIPMNICB1 resource to the nameProxy1 resource.

hares -link nameIPMNICB1 nameProxy1
hares -link nameIPMNICB1 nameShare1
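To verify the new dependency links, you can display the resource dependency list (an optional check):

hares -dep nameIPMNICB1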

2 Switch the nameSG1 service group between the systems to test its resources on each system. Verify that the IP address specified in the nameIPMNICB1 resource switches with the service group.

hagrp -switch nameSG1 -to their_sys
hagrp -switch nameSG1 -to your_sys
(other systems if available)

3 Set the new resource to critical (nameIPMNICB1).

hares -modify nameIPMNICB1 Critical 1

4 Save the cluster configuration.

haconf -dump


Testing IPMultiNICB Failover

Note: Wait for all participants to complete the steps to this point. Then, test the NetworkMNICB resource by performing the following procedure. Each student can take turns to test their resource, or all can observe one test.

1 Determine which interface the nameIPMNICB1 resource is using on the system where it is currently online.

ifconfig -a

2 Unplug the network cable from that interface.

What happens to the nameIPMNICB1 IP address?

The nameIPMNICB1 IP address should move to the other interface on the same system.

3 Use ifconfig to determine the status of the interface with the unplugged cable.

The interface should have a failed flag.

4 Leave the network cable unplugged. Unplug the other interface that the NetworkMNICB resource is now using.

What happens to the NetworkMNICB resource and the nameSG1 service group?

The NetworkMNICB resource should fault on the system with the cables removed; nameSG1 should fail over to the system still connected to the network.

5 Replace the cables.

What happens?

The NetworkMNICB resource should clear and be brought online again; nameIPMNICB1 should remain faulted.


6 Clear the nameIPMNICB1 resource if it is faulted.

hares -clear nameIPMNICB1

7 Save and close the configuration.

haconf -dump -makero


Alternate Lab: Configuring MultiNICA and IPMultiNIC

Note: Only complete this lab if you are working on an AIX, HP-UX, or Linux system in your classroom.

Work together using the values in the table to create a MultiNICA resource.

Resource Definition Sample Value Your Value

Service Group NetworkSG

Resource Name NetworkMNICA

Resource Type MultiNICA

Required Attributes

Device (See the table that follows for admin IPs.)
  AIX: en3, en4
  HP-UX: lan3, lan4
  Linux: eth3, eth4

NetworkHosts (HP-UX only)

192.168.xx.xxx (See the instructor.)

NetMask (AIX, Linux only)

255.255.255.0

Critical? No (0)

Enabled? Yes (1)

System Admin IP Address

train1 10.10.10.101

train2 10.10.10.102

train3 10.10.10.103

train4 10.10.10.104

train5 10.10.10.105

train6 10.10.10.106

train7 10.10.10.107

train8 10.10.10.108

train9 10.10.10.109

train10 10.10.10.110

train11 10.10.10.111

train12 10.10.10.112


1 Verify the cabling or recable the network according to the previous diagram.

2 Set up the /etc/hosts file on each system to have an entry for each interface on each system in the cluster using the following address scheme where 1, 2, 3, and 4 are system numbers.

/etc/hosts
10.10.10.101 train1_mnica
10.10.10.102 train2_mnica
10.10.10.103 train3_mnica
10.10.10.104 train4_mnica

3 Verify that NetworkSG is online on both systems.

hagrp -display NetworkSG

4 Open the cluster configuration.

haconf -makerw

5 Add the NetworkMNICA resource to the NetworkSG service group.

hares -add NetworkMNICA MultiNICA NetworkSG

6 Set the resource to not critical.

hares -modify NetworkMNICA Critical 0

7 Set the required attributes for this resource, and any optional attributes if needed.

hares -modify NetworkMNICA Device interface1 \
10.10.10.1xx interface2 10.10.10.1xx
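As an illustration only, on a Linux system named train1 the filled-in command might look like this, using the interface names and admin IP address from the tables above:

hares -modify NetworkMNICA Device eth3 10.10.10.101 eth4 10.10.10.101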



8 Enable the resource.

hares -modify NetworkMNICA Enabled 1

9 Verify that the resource is online in VCS and at the operating system level.

hares -display NetworkMNICA
ifconfig -a

HP-UX:
netstat -in

10 Make the resource critical.

hares -modify NetworkMNICA Critical 1

11 Save the cluster configuration and view the configuration file to verify your changes.

haconf -dump


Reconfiguring Proxy

In this portion of the lab, modify the Proxy resource in the nameSG1 service group to reference the MultiNICA resource.

Resource Definition      Sample Value       Your Value
Service Group            nameSG1
Resource Name            nameProxy1
Resource Type            Proxy
Required Attributes
  TargetResName          NetworkMNICA
Critical?                No (0)
Enabled?                 Yes (1)

1 Take the nameIP1 resource and all resources above it offline in the nameSG1 service group.

hares -dep nameIP1
hares -offline nameApp1 -sys system
hares -offline nameIP1 -sys system

2 Disable the nameProxy1 resource.

hares -modify nameProxy1 Enabled 0

3 Edit the nameProxy1 resource and change its target resource name to NetworkMNICA.

hares -modify nameProxy1 TargetResName NetworkMNICA

4 Enable the nameProxy1 resource.

hares -modify nameProxy1 Enabled 1

5 Delete the nameIP1 resource.

hares -delete nameIP1



Each student works separately to create an IPMultiNIC resource in their own nameSG1 service group using the values in the table.

Configuring IPMultiNIC

Resource Definition Sample Value Your Value

Service Group nameSG1
Resource Name nameIPMNIC1

Resource Type IPMultiNIC

Required Attributes

MultiNICResName NetworkMNICA

Address See the table that follows.

NetMask (HP-UX, Linux only)

255.255.255.0

Critical? No (0)

Enabled? Yes (1)

System Virtual Address

train1 192.168.xxx.51

train2 192.168.xxx.52

train3 192.168.xxx.53

train4 192.168.xxx.54

train5 192.168.xxx.55

train6 192.168.xxx.56

train7 192.168.xxx.57

train8 192.168.xxx.58

train9 192.168.xxx.59

train10 192.168.xxx.60

train11 192.168.xxx.61

train12 192.168.xxx.62


1 Add the resource to the service group.

hares -add nameIPMNIC1 IPMultiNIC nameSG1

2 Set the resource to not critical.

hares -modify nameIPMNIC1 Critical 0

3 Set the required attributes for this resource, and any optional attributes if needed.

hares -modify nameIPMNIC1 Address IP_address
hares -modify nameIPMNIC1 MultiNICResName NetworkMNICA
hares -modify nameIPMNIC1 NetMask 255.255.255.0

4 Enable the resource.

hares -modify nameIPMNIC1 Enabled 1

5 Bring the resource online on your system.

hares -online nameIPMNIC1 -sys your_system

6 Verify that the resource is online in VCS and at the operating system level.

hares -display nameIPMNIC1
ifconfig -a

HP-UX:
netstat -in

7 Save the cluster configuration.

haconf -dump


Linking IPMultiNIC

1 Link the nameIPMNIC1 resource to the nameProxy1 resource.

hares -link nameIPMNIC1 nameProxy1

2 If present, link the nameProcess1 or nameApp1 resource to nameIPMNIC1.

hares -link nameIPMNIC1 nameProcess1|App1

3 Switch the nameSG1 service group between the systems to test its resources on each system. Verify that the IP address specified in the nameIPMNIC1 resource switches with the service group.

hagrp -switch nameSG1 -to their_sys
hagrp -switch nameSG1 -to your_sys
(other systems if available)

4 Set the new resource to critical (nameIPMNIC1).

hares -modify nameIPMNIC1 Critical 1

5 Save the cluster configuration.

haconf -dump


Testing IPMultiNIC Failover

Note: Wait for all participants to complete the steps to this point. Then, test the NetworkMNICA resource by performing the following procedure. (Each student can take turns to test their resource, or all can observe one test.)

1 Determine which interface the nameIPMNIC1 resource is using on the system where it is currently online.

ifconfig -a

HP-UX:
netstat -in

2 Unplug the network cable from that interface.

What happens to the nameIPMNIC1 IP address?

The nameIPMNIC1 IP address should move to the other interface on the same system.

3 Use ifconfig (or netstat) to determine the status of the interface with the unplugged cable.

ifconfig -a

HP-UX:
netstat -in

The base IP address and virtual IP addresses move to the other interfaces.

4 Leave the network cable unplugged. Unplug the other interface that the NetworkMNICA resource is now using.

What happens to the NetworkMNICA resource and the nameSG1 service group?

The NetworkMNICA resource should fault on the system with the cables removed; nameSG1 should fail over to the system still connected to the network.


5 Replace the cables.

What happens?

The NetworkMNICA resource should clear and be brought online again; nameIPMNIC1 should remain faulted.

6 Clear the nameIPMNIC1 resource if it is faulted.

hares -clear nameIPMNIC1

7 Save and close the configuration.

haconf -dump -makero


Appendix D
Job Aids


Service Group Dependencies—Definitions

Online local

Manual Operations Automatic Failover

Failover System Exists No Failover System

soft

• Parent group cannot be brought online when child group is offline

• Child group can be taken offline when parent group is online

• Parent group cannot be switched over when child group is online

• Child group cannot be switched over when parent group is online

Parent Fails

• Parent faults and is taken offline
• Child continues to run on the original system
• No failover

Child Fails

• Child faults and is taken offline

• Child fails over to available system

• Parent follows child after the child is brought successfully online

• Child faults and is taken offline

• Parent continues to run on the original system

• No failover

firm

• Parent group cannot be brought online when child group is offline

• Child group cannot be taken offline when parent group is online

• Parent group cannot be switched over when child group is online

• Child group cannot be switched over when parent group is online

Parent Fails

• Parent faults and is taken offline
• Child continues to run on the original system
• No failover

Child Fails

• Child faults and is taken offline

• Parent is taken offline
• Child fails over to an available system
• Parent fails over to the same system as the child

• Child faults and is taken offline

• Parent is taken offline

• No failover

hard

• Parent group cannot be brought online when child group is offline

• Child group cannot be taken offline when parent group is online

• Parent group can be switched over when child group is online (child switches together with parent)

• Child group cannot be switched over when parent group is online

Parent Fails

• Parent faults and is taken offline

• Child is taken offline
• Child fails over to an available system
• Parent fails over to the same system as the child

• Parent faults and is taken offline

• Child continues to run on the original system

• No failover

Child Fails

• Child faults and is taken offline

• Parent is taken offline
• Child fails over to an available system
• Parent fails over to the same system as child

• Child faults and is taken offline

• Parent is taken offline

• No failover


Online global

Manual Operations Automatic Failover

Failover System Exists No Failover System

soft

• Parent group cannot be brought online when child group is offline

• Child group can be taken offline when parent group is online

• Parent group can be switched over when child group is online

• Child group can be switched over when parent group is online

Parent Fails

• Parent faults and is taken offline

• Child continues to run on the original system

• Parent fails over to an available system

• Parent faults and is taken offline

• Child continues to run on the original system

• No failover

Child Fails

• Child faults and is taken offline

• Parent continues to run on the original system

• Child fails over to an available system

• Child faults and is taken offline

• Parent continues to run on the original system

• No failover

firm

• Parent group cannot be brought online when child group is offline

• Child group cannot be taken offline when parent group is online

• Parent group can be switched over when child group is online

• Child group cannot be switched over when parent is online

Parent Fails

• Parent faults and is taken offline

• Child continues to run on the original system

• Parent fails over to an available system

• Parent faults and is taken offline

• Child continues to run on the original system

• No failover

Child Fails

• Child faults and is taken offline

• Parent is taken offline
• Child fails over to an available system
• Parent restarts on an available system

• Child faults and is taken offline

• Parent is taken offline

• No failover


Online remote

Manual Operations Automatic Failover

Failover System Exists No Failover System

soft

• Parent group cannot be brought online when child group is offline

• Child group can be taken offline when parent group is online

• Parent group can be switched over when child group is online (but not to the system where child group is online)

• Child group can be switched over when parent group is online (but not to the system where the parent group is online)

Parent Fails

• Parent faults and is taken offline

• Child continues to run on the original system

• Parent fails over to available system; if the only available system is where the child is online, parent stays offline

• Parent faults and is taken offline

• Child continues to run on the original system

• No failover

Child Fails

• Child faults and is taken offline

• Child fails over to an available system; if the only available system is where the parent is online, the parent is taken offline before the child is brought online. The parent then restarts on a system different than the child. Otherwise, the parent continues to run on the original system

• Child faults and is taken offline

• Parent continues to run on the original system

• No failover

firm

• Parent group cannot be brought online when child group is offline

• Child group cannot be taken offline when parent group is online

• Parent group can be switched over when child group is online (but not to the system where the child group is online)

• Child group cannot be switched over when parent is online

Parent Fails

• Parent faults and is taken offline

• Child continues to run on the original system

• Parent fails over to an available system; if the only available system is where the child is online, parent stays offline

• Parent faults and is taken offline

• Child continues to run on the original system

• No failover

Child Fails

• Child faults and is taken offline

• Parent is taken offline
• Child fails over to an available system
• If the child fails over to the system where the parent was online, the parent restarts on a different system; otherwise, the parent restarts on the system where it was online

• Child faults and is taken offline

• Parent is taken offline

• No failover


Offline local

Manual Operations Automatic Failover

Failover System Exists No Failover System

• Parent group can only be brought online when child group is offline

• Child group can be taken offline when parent group is online

• Parent group can be switched over when child group is online (but not to the system where child group is online)

• Child group can be switched over when parent group is online (but not to the system where the parent group is online)

Parent Fails

• Parent faults and is taken offline

• Child continues to run on the original system

• Parent fails over to an available system where child is offline; if the only available system is where the child is online, parent stays offline

• Parent faults and is taken offline

• Child continues to run on the original system

• No failover

Child Fails

• Child faults and is taken offline

• Child fails over to an available system; if the only available system is where the parent is online, the parent is taken offline before the child is brought online. The parent then restarts on a system different than the child. Otherwise, the parent continues to run on the original system

• Child faults and is taken offline

• Parent continues to run on the original system (assuming that the child cannot fail over to that system due to a FAULTED status)

• No failover


Service Group Dependencies—Failover Process


The following steps describe what happens when a service group in a service group dependency relationship is faulted due to a critical resource fault (a configuration sketch follows the steps):

1 The entire service group is taken offline due to the critical resource fault, together with any of its parent service groups that have an online firm or hard dependency (online local firm, online global firm, online remote firm, or online local hard).

2 Then a failover target is chosen from the SystemList of the service group based on the failover policy and the restrictions brought by the service group dependencies. Note that if the faulted service group is also the parent service group in a service group dependency relationship, the service group dependency has an impact on the choice of a target system. For example, if the faulted service group has an online local (firm or soft) dependency with a child service group that is online only on that system, no failover targets are available.

3 If there are no other systems the service group can fail over to, both the child service group and all of the parents that were already taken offline remain offline.

4 If there is a failover target, then VCS takes any child service group with an online local hard dependency offline.

5 VCS then checks if there are any conflicting parent service groups that are already online on the target system. These service groups can be parent service groups that are linked with an offline local dependency or online remote soft dependency. In either case, the parent service group is taken offline to enable the child service group to start on that system.

6 If there is any child service group with an online local hard dependency, first the child service group and then the service group that initiated the failover are brought online.

7 After the service group is brought online successfully on the target system, VCS takes any parent service groups offline that have an online local soft dependency to the failed-over child.

8 Finally, VCS selects a failover target for any parent service groups that may have been taken offline during steps 1, 5, or 7 and brings the parent service group online on an available system.

9 If there are no target systems available to fail over the parent service group that has been taken offline, the parent service group remains offline.
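For reference, a service group dependency of this kind is declared either in main.cf or from the command line. The sketch below uses hypothetical group names (appSG as the parent, dbSG as the child) and an online local firm dependency:

main.cf (inside the appSG group definition):
requires group dbSG online local firm

Command-line equivalent (with the configuration open for writing):
hagrp -link appSG dbSG online local firm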


Appendix E
Design Worksheet: Template


Cluster Interconnect Configuration

First system:

/etc/VRTSvcs/comms/llttab Sample Value Your Value

set-node(host name)

set-cluster(number in host name of odd system)

link

link

/etc/VRTSvcs/comms/llthosts Sample Value Your Value

/etc/VRTSvcs/comms/sysname Sample Value Your Value
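As an illustration, with hypothetical host names and Solaris qfe interfaces, completed values for the files above might look like this:

llttab:
set-node train11
set-cluster 11
link qfe0 /dev/qfe:0 - ether - -
link qfe1 /dev/qfe:1 - ether - -

llthosts:
0 train11
1 train12

sysname:
train11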


Second system:

/etc/VRTSvcs/comms/llttab Sample Value Your Value

set-node

set-cluster

link

link

/etc/VRTSvcs/comms/llthosts Sample Value Your Value

/etc/VRTSvcs/comms/sysname Sample Value Your Value

Cluster Configuration (main.cf)

Types Definition Sample Value Your Value

Include types.cf

Cluster Definition Sample Value Your Value

Cluster

Required Attributes

UserNames


ClusterAddress

Administrators

Optional Attributes

CounterInterval

System Definition Sample Value Your Value

System

System


Service Group Definition Sample Value Your Value

Group

Required Attributes

FailOverPolicy

SystemList

Optional Attributes

AutoStartList

OnlineRetryLimit
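As an illustration, with hypothetical group, system, and attribute values, a filled-in service group definition in main.cf might look like this:

group nameSG1 (
    SystemList = { train11 = 0, train12 = 1 }
    AutoStartList = { train11 }
    FailOverPolicy = Priority
    OnlineRetryLimit = 3
    )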


Resource Definition Sample Value Your Value

Service Group

Resource Name

Resource Type

Required Attributes

Optional Attributes

Critical?

Enabled?


Resource Definition Sample Value Your Value

Service Group

Resource Name

Resource Type

Required Attributes

Optional Attributes

Critical?

Enabled?


Resource Definition Sample Value Your Value

Service Group

Resource Name

Resource Type

Required Attributes

Optional Attributes

Critical?

Enabled?


Resource Definition Sample Value Your Value

Service Group

Resource Name

Resource Type

Required Attributes

Optional Attributes

Critical?

Enabled?


Resource Dependency Definition

Service Group

Parent Resource Requires Child Resource


Index

A
acceptance test 6-11
adding systems 1-19
administrator 6-14
agent
  Disk 4-5
  DiskReservation 4-5, 4-10
  IPMultiNIC 4-21
  IPMultiNICB 4-36
  LVMCombo 4-9
  LVMLogicalVolume 4-9
  LVMVolumeGroup 4-6, 4-8
  MultiNICA 4-14
  MultiNICB 4-27, 4-29
AIX, LVMVolumeGroup 4-6
application relationships, examples 2-4
attribute
  AutoFailOver 3-10
  AutoStart 3-4
  AutoStartList 3-4
  AutoStartPolicy 3-5
  Capacity 3-14
  CurrentLimits 3-19
  DynamicLoad 3-15
  Load 3-14
  LoadTimeThreshold 3-16
  LoadWarningLevel 3-16
  Prerequisites 3-19
  SystemList 3-4
autodisable 3-4
AutoFailOver attribute 3-10
automatic startup
  policy 3-5
AutoStart 3-4
AutoStartList attribute 3-4
AutoStartPolicy
  attribute 3-5
  Load 3-8
  Order 3-6
  Priority 3-7
AvailableCapacity attribute failover policy 3-14

B
base IP address 4-40
best practice
  cluster interconnect 6-4
  commands 6-10
  external dependencies 6-8
  failover 6-7
  knowledge transfer 6-13
  network 6-6
  simplicity 6-10
  storage 6-5
  test 6-9

C
Capacity attribute failover policy 3-14
child
  offline local fault 2-18
  online global firm fault 2-15
  online global soft fault 2-14
  online local firm fault 2-11
  online local soft fault 2-10
  online remote firm fault 2-17
  online remote soft fault 2-17
  service group 2-8
cluster
  adding a system 1-19
  design sample Intro-5
  maintenance 6-13
  merging 1-33
  replacing a system 5-4
  single node 5-17
  testing 6-9
cluster interconnect best practice 6-4
communication files, modifying 1-37
configure
  IPMultiNIC 4-22
  MultiNICA 4-17
  MultiNICB 4-33
Critical attribute 6-7
critical, resource 6-7
CurrentLimits 3-19


D
dependency
  external 6-8
  offline local 2-18
  online global 2-14
  online local 2-10
  online remote 2-16
  service group 2-8
  service group configuration 2-19
  using resources 2-22
design
  cluster 6-22
  network 4-26
  sample Intro-5
disaster recovery 5-17, 6-22
disk group, upgrade 5-7
Disk, agent 4-5
DiskReservation 4-10
downtime, minimize 4-11
dynamic load balancing 3-15
DynamicLoad 3-15

E
ElifNone, controlling service groups 2-22
enterprise agent, upgrade 5-11
event triggers 2-24

F
failover
  best practice 6-7
  between local network interfaces 4-11, 4-12
  configure policy 3-21
  critical resource 6-7
  IPMultiNIC 4-25
  MultiNICA 4-20
  MultiNICB 4-28
  network 4-11
  policy 3-11
  service group 3-10
  service group dependency 2-9
  system selection 3-10
FailOverPolicy
  attribute definition 3-11
  Load 3-14
  Priority 3-12
  RoundRobin 3-13
fault
  offline local dependency 2-18
  online global firm dependency 2-15
  online local firm 2-12
  online local firm dependency 2-11
  online local hard dependency 2-13
  online local soft dependency 2-10
  online remote firm dependency 2-17
fencing, VCS upgrade 5-11
FileOnOff, controlling service groups 2-22

G
Global Cluster Option 6-22

H
haipswitch command 4-38
hardware, upgrade 5-5
high availability, reference 6-16, 6-20
HP-UX
  LVMCombo 4-9
  LVMLogicalVolume 4-9
  LVMVolumeGroup 4-8
HP-UX, LVM setup 4-7

I
install
  manual 5-14
  manual procedure 5-14
  package 5-14
  remote root access 5-14
  secure 5-12
  single system 5-14
  VCS 5-12
installvcs command 5-12
interface alias 4-35
IP alias 4-35
IPMultiNIC
  advantages 4-41
  configure 4-22


  definition 4-21
  failover 4-25
  optional attributes 4-22
IPMultiNICB 4-36
  advantages 4-41
  configuration prerequisites 4-37
  configure 4-37
  defined 4-26
  optional attributes 4-37
  required attributes 4-36

J
Java Console, upgrade 5-11

K
key. See license. 5-16

L
license
  checking 5-16
  replace system 5-4
  system 6-5
  VCS 5-16
Limits attribute 3-18
link, service group dependency 2-20
Linux, DiskReservation 4-10
Load attribute, failover policy 3-14
load balancing, dynamic 3-15
Load, failover policy 3-11
LoadTimeThreshold 3-16
LoadWarning trigger 3-16
LoadWarningLevel 3-16
local, attribute 4-19
LVM setup 4-7
LVMCombo 4-9
LVMLogicalVolume 4-9
LVMVolumeGroup 4-6, 4-8

M
maintenance 6-13
manual
  install methods 5-14
  install procedure 5-14
merging clusters 1-33
modify communication files 1-37
mpathd 4-27
MultiNICA
  advantages 4-41
  configure 4-17
  definition 4-14
  example configuration 4-40
  failover 4-20
  testing 4-42
MultiNICB
  advantages 4-41
  agent 4-29
  configuration prerequisites 4-33
  defined 4-26
  example configuration 4-40
  failover 4-28
  modes 4-27
  optional attributes 4-30
  required attributes 4-29
  resource type 4-29
  sample interface configuration 4-34
  sample resource configuration 4-35
  switch network interfaces 4-38
  testing 4-42
  trigger 4-39

N
network
  best practice 6-6
  design 4-26
  failure 4-11
  multiple interfaces 4-11

O
offline local
  definition 2-18
  dependency 2-18
  using resources 2-23
online global firm 2-15
online global soft 2-14


online global, definition 2-14
online local firm 2-11
online local hard 2-13
online local soft 2-10
online local, definition 2-10
online remote 2-16
online remote firm 2-17
online remote soft 2-16
operating system upgrade 5-6
overload, controlling 3-16

P
package, install 5-14
parent
  offline local fault 2-18
  online global firm fault 2-15
  online global soft fault 2-14
  online local firm fault 2-12
  online local hard fault 2-13
  online local soft fault 2-11
  online remote firm fault 2-17
  online remote soft fault 2-17
  service group 2-8
policy
  failover 3-11
  service group startup 3-4
PostOffline trigger 2-24
PostOnline trigger 2-24
PreOnline trigger 2-24
Prerequisites attribute 3-18
primary site 5-17
Priority, failover policy 3-11
probe, service group startup 3-4

R
RDC 6-22
references for high availability 6-20
removing, system 1-5
replace, system 5-4
Replicated Data Cluster 6-22
report 6-15
resource
  controlling service groups 2-22
  IPMultiNIC 4-21
  network-related 4-14
resource type
  DiskReservation 4-5, 4-10
  IPMultiNICB 4-36
  LVMCombo 4-9
  LVMLogicalVolume 4-9
  LVMVolumeGroup 4-6, 4-8
  MultiNICA 4-14
  MultiNICB 4-29
rolling upgrade 5-7
RoundRobin, failover policy 3-11

S
SCSI-II reservation 4-5
secondary site 5-17
service group
  automatic startup 3-4
  AutoStartPolicy 3-5
  controlling with triggers 2-24
  dependency 2-8
  dependency configuration 2-20
  dynamic load balancing 3-15
  startup policy 3-4
  startup rules 3-4
  workload management 3-2
service group dependency
  configure 2-19
  definition 2-8
  examples 2-10
  limitations 2-21
  offline local 2-18
  online global 2-14
  online local 2-10
  online local firm 2-11
  online local soft 2-10
  online remote 2-16
  rules 2-19
  using resources 2-22
SGWM 3-2
simulator
  model failover 6-7
  model workload 3-24
single node cluster 5-17


software upgrade 5-5
Solaris
  Disk 4-5
  DiskReservation 4-5
  network 4-26
startup
  configure policy 3-21
  policy 3-4
  service group 3-4
  system selection 3-4
storage
  alternative configurations 4-4
  best practice 6-5
switch, network interfaces 4-38
system
  adding to a cluster 1-19
  removing from a cluster 1-5
  replace 5-4
SystemList attribute 3-4

T
test
  acceptance 6-11
  best practice 6-9
  examples 6-12
Test, MultiNIC 4-42
trigger
  controlling service groups 2-24
  LoadWarning 3-16
  MultiNICB 4-39
  PostOffline 2-24
  PostOnline 2-24
  PreOnline 2-24
trunking, defined 4-26

U
uninstallvcs command 5-11
upgrade
  enterprise agent 5-11
  Java Console 5-11
  license 5-8
  operating system 5-6
  rolling 5-7
  software and hardware 5-5
  VCS 5-8
  VERITAS notification 5-18
  VxVM disk group 5-7

V
VCS
  design sample Intro-5
  install 5-12
  license 5-4, 5-16
  upgrade 5-8
VERITAS Global Cluster Option 5-17
VERITAS Volume Replicator 5-17
VERITAS, product information 5-18
virtual IP address, IPMultiNICB 4-35
vxlicrep command 5-16
VxVM
  fencing 5-11
  upgrade 5-7

W
workload management, service group 3-2
workload, AutoStartPolicy 3-8
