Veritas Cluster 2.0
Page 1: Veritas Cluster 2.0

Thank you.

We request that you please turn off pagers and cell phones during class.

Page 2: Veritas Cluster 2.0

VERITAS Cluster Server for Solaris

Lesson 1: VCS Terms and Concepts

Page 3: Veritas Cluster 2.0

Overview

(Course roadmap: Introduction; Terms and Concepts; Installing VCS; Managing Cluster Services; Using Cluster Manager; Service Group Basics; Preparing Resources; Resources and Agents; NFS Resources; Using Volume Manager; Installing Applications; Cluster Communication; Faults and Failovers; Event Notification; Troubleshooting)

Page 4: Veritas Cluster 2.0

Objectives

After completing this lesson, you will be able to:
• Define VCS terminology.
• Describe cluster communication basics.
• Describe VERITAS Cluster Server architecture.

Page 5: Veritas Cluster 2.0

Clusters

(Diagram: several systems on a local area network, connected through fibre switches to shared SCSI JBOD storage)

• Several networked systems
• Shared storage
• Single administrative entity
• Peer monitoring

Page 6: Veritas Cluster 2.0

Systems

Members of a cluster:
• Referred to as nodes
• Contain copies of:
  – Communication protocol configuration files
  – VCS configuration files
  – VCS libraries and directories
  – VCS scripts and daemons
• Share a single dynamic cluster configuration
• Provide application services

Page 7: Veritas Cluster 2.0

Service Groups

• A service group is a related collection of resources.
• Resources in a service group must be available to the system.
• Resources and service groups have interdependencies.

(Diagram: an NFS service group containing NFS, IP, Disk, Mount, Share, and NIC resources)

Page 8: Veritas Cluster 2.0

Service Group Types

Failover
• Can be partially or fully online on only one server at a time
• VCS controls stopping and restarting the service group when components fail

Parallel
• Can be partially or fully online on multiple servers simultaneously
• Examples:
  – Oracle Parallel Server
  – Web, FTP servers

Page 9: Veritas Cluster 2.0

Resources

• VCS objects that correspond to hardware or software components
• Monitored and controlled by VCS
• Classified by type
• Identified by unique names and attributes
• Can depend on other resources within the same service group

Page 10: Veritas Cluster 2.0

Resource Types

A general description of the attributes of a resource.

Example Mount resource type attributes:
• MountPoint
• BlockDevice

Other example resource types:
• Disk
• Share
• IP
• NIC

Page 11: Veritas Cluster 2.0

Agents

• Processes that control resources
• One agent per resource type
• An agent controls all resources of its type.
• Agents can be added into the VCS agent framework.

(Diagram: a Disk agent managing resources c1t0d0s0 and c1t0d1s0, an IP agent managing 10.1.2.4, a NIC agent managing hme0 and qfe1, and a Mount agent managing /data)

Page 12: Veritas Cluster 2.0

Dependencies

• Resources can depend on other resources.
• Parent resources depend on child resources.
• Service groups can depend on other service groups.
• Resource types can depend on other resource types.
• Rules govern service group and resource dependencies.
• No cyclic dependencies are allowed.

(Diagram: a Mount resource (parent) depending on a Disk resource (child); the configuration syntax for this is sketched below)
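
For illustration, such a dependency appears in the cluster configuration file as a requires statement. A minimal main.cf fragment (resource names are illustrative; the syntax is covered in the Resources and Agents lesson):

Disk MyDisk (
    Partition = c1t0d0s0
    )

Mount MyMount (
    MountPoint = "/data"
    BlockDevice = "/dev/dsk/c1t0d0s0"
    FSType = vxfs
    )

MyMount requires MyDisk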

Page 13: Veritas Cluster 2.0

Private Network

Minimum of two communication channels with separate infrastructure:
• Multiple NICs (not just ports)
• Separate hubs, if used

Heartbeat communication determines which systems are members of the cluster. Cluster configuration broadcasts update cluster systems with the status of each resource and service group.

Page 14: Veritas Cluster 2.0

Low Latency Transport (LLT)

• Provides fast, kernel-to-kernel communications
• Is connection oriented
• Is not routable
• Uses Data Link Provider Interface (DLPI) over Ethernet

(Diagram: LLT running in the kernel on SystemA and SystemB, communicating over the private network hardware; an example configuration file is sketched below)
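
LLT is configured on each node through /etc/llttab; a minimal sketch for this course's example cluster (the node name, cluster number, and link devices are assumptions for illustration):

set-node train7
set-cluster 200
link qfe0 /dev/qfe:0 - ether - -
link qfe1 /dev/qfe:1 - ether - -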

Page 15: Veritas Cluster 2.0

Group Membership Services/Atomic Broadcast (GAB)

• Manages cluster membership
• Maintains cluster state
• Uses broadcasts
• Runs in the kernel over Low Latency Transport (LLT)

(Diagram: GAB layered on LLT in the kernel on SystemA and SystemB, over the private network hardware; an example startup entry is sketched below)
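
GAB is started from /etc/gabtab; a typical entry for a two-node cluster (the seed count of 2 is an assumption matching this example):

/sbin/gabconfig -c -n 2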

Page 16: Veritas Cluster 2.0

VCS Engine (had)

• Maintains configuration and state information for all cluster resources
• Uses GAB to communicate among cluster systems
• Is monitored by the hashadow process

(Diagram: had and hashadow running on SystemA and SystemB, above GAB and LLT in the kernel, over the private network hardware)

Page 17: Veritas Cluster 2.0

VCS Architecture

(Diagram: the shared cluster configuration held in memory on SystemA and SystemB; on each system, hashadow monitors had, which directs the Disk, NIC, IP, and Mount agents managing resources such as c1d0t0s0, hme0, 10.1.2.4, and /v; the systems communicate through GAB and LLT in the kernel over the private network hardware)

Page 18: Veritas Cluster 2.0

Summary

You should now be able to:
• Define VCS terminology.
• Describe cluster communication basics.
• Describe VERITAS Cluster Server architecture.

Page 19: Veritas Cluster 2.0

VERITAS Cluster Server for Solaris

Lesson 2: Installing VERITAS Cluster Server

Page 20: Veritas Cluster 2.0

Overview

(Course roadmap: Introduction; Terms and Concepts; Installing VCS; Managing Cluster Services; Using Cluster Manager; Service Group Basics; Preparing Resources; Resources and Agents; NFS Resources; Using Volume Manager; Installing Applications; Cluster Communication; Faults and Failovers; Event Notification; Troubleshooting)

Page 21: Veritas Cluster 2.0

Objectives

After completing this lesson, you will be able to:
• Describe VCS software, hardware, and licensing prerequisites.
• Describe the general VCS hardware requirements.
• Configure SCSI controllers for a shared disk storage environment.
• Add VCS executable and manual page paths to the environment variables.
• Install VCS using the installation script.

Page 22: Veritas Cluster 2.0

Software and Hardware Requirements

Software:
• Solaris 2.6, 7, and 8 (32-bit and 64-bit)
• Recommended:
  – Solaris patches
  – VERITAS Volume Manager (VxVM) 3.1.P1+
  – VERITAS File System (VxFS) 3.3.1+

Hardware:
• Check the latest VCS release notes.
• Contact VERITAS Support.

Licenses:
• Keys are required on a per-system or per-site basis.
• Contact VERITAS Sales for a new license, or VERITAS Support for upgrades.

Page 23: Veritas Cluster 2.0

General Hardware Layout

(Diagram: System A and System B, each with an OS disk on its first SCSI controller and NICs for the public network; private Ethernet heartbeat links connect the systems, and shared data disks attach to a second SCSI controller on each system)

Page 24: Veritas Cluster 2.0

SCSI Controller Configuration

(Diagram: System A and System B attached to shared data disks on a common SCSI bus at target IDs 1-4; each system's controller uses a unique scsi-initiator-id, 7 and 5 in the example, while each OS disk stays on its system's local SCSI bus at target 0)

Page 25: Veritas Cluster 2.0

SCSI Controller Setup

• Use unique SCSI IDs for each system.
• Check the scsi-initiator-id setting using the eeprom command (see the sketch below).
• Change the scsi-initiator-id if needed.
• The controller ID can also be changed on a controller-by-controller basis.
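
For reference, checking and changing the setting from Solaris looks roughly like this (run as root; the value 5 is an illustrative choice for the second system, and the change takes effect after a reboot):

# eeprom scsi-initiator-id
scsi-initiator-id=7
# eeprom scsi-initiator-id=5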

Page 26: Veritas Cluster 2.0

Setting Environment Variables

For Bourne or Korn shell (sh or ksh):
• PATH:
  PATH=$PATH:/sbin:/opt/VRTSvcs/bin:/opt/VRTSllt
  export PATH
• MANPATH:
  MANPATH=$MANPATH:/opt/VRTS/man
  export MANPATH
• Add these lines to /.profile.

For C shell (csh or tcsh):
• PATH:
  setenv PATH ${PATH}:/sbin:/opt/VRTSvcs/bin:/opt/VRTSllt
• MANPATH:
  setenv MANPATH ${MANPATH}:/opt/VRTS/man

Page 27: Veritas Cluster 2.0

The installvcs Utility

Uses pkgadd to install the VCS packages on all the systems in the cluster:
• VRTSllt
• VRTSgab
• VRTSperl
• VRTSvcs
• VRTSweb
• VRTSvcsw
• VRTSvcsdc

• Requires remote root access to other systems in the cluster while the script is being run (/.rhosts file).
  Note: You can remove the .rhosts files after VCS installation (a sample is sketched below).
• Configures two private network links for VCS communications.
• Brings the cluster up without any services.
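
As an illustration, a minimal /.rhosts on each node granting the remote root access that installvcs needs during installation (the hostnames are the example systems):

train7 root
train8 root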

Page 28: Veritas Cluster 2.0

Installation Settings

Information required by installvcs:
• Cluster name
• Cluster number
• System names
• License key
• Network ports for the private network
• Web Console configuration:
  – Virtual IP address
  – Subnet mask
  – Network interface
• SMTP/SNMP notification configuration (discussed later)

Page 29: Veritas Cluster 2.0

Starting VCS Installation

# ./installvcs

Please enter the unique Cluster Name : mycluster
Please enter the unique Cluster ID (a number from 0-255) : 200
Enter the systems on which you want to install
(system names separated by spaces) : train7 train8
Analyzing the system for install...
Enter the license key for train7 : XXXX XXXX ...
Applying the license key to all systems in the cluster ...

Page 30: Veritas Cluster 2.0

Installing the Private Network

Following is the list of discovered NICs:
Sr. No.  NIC Device
1.       /dev/hme:0
2.       /dev/qfe:0
3.       /dev/qfe:1
4.       /dev/qfe:2
5.       /dev/qfe:3
6.       Other
From the list above, please enter the serial number
(the number appearing in the Sr. No. column) of the NIC for
First PRIVATE network link: 1
From the list above, please enter the serial number
(the number appearing in the Sr. No. column) of the NIC for
Second PRIVATE network link: 2
Do you have the same network cards set up on all
systems (Y/N)? y
...

Page 31: Veritas Cluster 2.0

Configuring the Web Console

Do you want to configure the Cluster Manager (Web Console) (Y/N) [Y] ? y
Enter the Virtual IP address for the Web Server : 192.168.27.9
Enter Subnet [255.255.255.0]: <enter>
Enter the NIC Device for this Virtual IP address (public network) on train7 [hme0]: <enter>
Do you have the same NIC Device on all other systems (Y/N) [Y] ? y
Do you want to configure SNMP and/or SMTP (e-mail) notification (Y/N) [Y] ? n
Summary information for ClusterService Group setup:
--------------------------------------------------
Cluster Manager (Web Console):
  Virtual IP Address  : 192.168.27.9
  Subnet              : 255.255.255.0
  Public Network link :
    train7 train8 : hme0
  URL to access : http://192.168.27.9:8181/vcs

Page 32: Veritas Cluster 2.0

Completing VCS Installation

Installing on train7.
Copying VRTSperl binaries...
Installing on train8.
Copying VRTSperl binaries...
Copying Cluster configuration files... Done.
Installation successful on all systems.
Installation can start the Cluster components on the following system/s:
train7 train8
Do you want to start these Cluster components now (Y/N) [Y] ? y
Loading GAB and LLT modules and starting VCS on train7:
Starting LLT... Start GAB... Start VCS
Loading GAB and LLT modules and starting VCS on train8:
Starting LLT... Start GAB... Start VCS

Page 33: Veritas Cluster 2.0

Summary

You should now be able to:
• Describe VCS software, hardware, and licensing prerequisites.
• Describe the general VCS hardware requirements.
• Configure SCSI controllers for a shared disk storage environment.
• Add VCS executable and manual page paths to the environment variables.
• Install VCS using the installation script.

Page 34: Veritas Cluster 2.0

Lab 2: Installing VCS

(Diagram: train1 and train2 with the shared-SCSI layout from the previous slides: shared data disks at target IDs 1-4, unique scsi-initiator-id settings on each controller, and local OS disks at target 0)

# ./installvcs

Page 35: Veritas Cluster 2.0

VERITAS Cluster Server for Solaris

Lesson 3: Managing Cluster Services

Page 36: Veritas Cluster 2.0

Overview

(Course roadmap: Introduction; Terms and Concepts; Installing VCS; Managing Cluster Services; Using Cluster Manager; Service Group Basics; Preparing Resources; Resources and Agents; NFS Resources; Using Volume Manager; Installing Applications; Cluster Communication; Faults and Failovers; Event Notification; Troubleshooting)

Page 37: Veritas Cluster 2.0

Objectives

After completing this lesson, you will be able to:
• Describe the cluster configuration mechanisms.
• Start the VCS engine on cluster systems.
• Stop the VCS engine.
• Modify the cluster configuration.
• Describe cluster transition states.

Page 38: Veritas Cluster 2.0

Cluster Configuration

(Diagram: SystemA and SystemB each run had and hashadow over GAB and LLT; each system keeps a local main.cf on disk, and the two share the cluster configuration in memory)

Page 39: Veritas Cluster 2.0

Starting VCS

(Diagram: hastart on System1 starts had and hashadow, which read the local main.cf and build the cluster configuration in memory; hastart on System2 starts had and hashadow, which find no valid local configuration and wait for one over the private network)

Page 40: Veritas Cluster 2.0

Starting VCS: Second System

(Diagram: System2 receives the cluster configuration from System1 over the private network, builds it in memory, and writes a copy to its local main.cf)

Page 41: Veritas Cluster 2.0

Starting VCS: Third System

(Diagram: all three systems run had and hashadow, share the cluster configuration in memory over the private network, and hold an up-to-date main.cf on local disk)

Page 42: Veritas Cluster 2.0

Stopping VCS

(Diagram: three ways of stopping had on System1 while System2 continues to run service group SGB:
1. hastop -local: had stops and service group SGA goes offline on System1.
2. hastop -local -evacuate: had stops and SGA migrates to System2.
3. hastop -local -force: had stops but SGA's applications keep running on System1.)

Page 43: Veritas Cluster 2.0

The hastop Command

The hastop command stops the VCS engine.

Syntax:
hastop -option [arg] [-option]

Options:
• -local [-force | -evacuate]
• -sys sys_name [-force | -evacuate]
• -all [-force]

Example:
hastop -sys train4 -evacuate

Page 44: Veritas Cluster 2.0

Displaying Cluster Status

The hastatus command displays the status of items in the cluster.

Syntax:
hastatus -option [arg] [-option arg]

Options:
• -group service_group
• -sum[mary]

Example:
hastatus -group OracleSG

Page 45: Veritas Cluster 2.0

Protecting the Cluster Configuration

(Diagram: the command sequence haconf -makerw, then hares -add ..., then haconf -dump -makero, with the on-disk state after each step)

1. Cluster configuration opened; a .stale file is created.
2. Resources are added to the cluster configuration in memory; main.cf is out of sync with the in-memory configuration.
3. Changes are saved to disk; .stale is removed. (A complete session is sketched below.)
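
Put together, a typical change session looks like the following sketch (the resource name and attribute are illustrative):

haconf -makerw
hares -add MyDisk Disk MySG
hares -modify MyDisk Partition c1t0d0s0
haconf -dump -makero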

Page 46: Veritas Cluster 2.0

Opening and Saving the Cluster Configuration

The haconf command opens, closes, and saves the cluster configuration.

Syntax:
haconf -option [-option]

Options:
• -makerw          Opens the configuration
• -dump            Saves the configuration
• -dump -makero    Saves and closes the configuration

Example:
haconf -dump -makero

Page 47: Veritas Cluster 2.0

Starting VCS with a Stale Configuration

(Diagram: hastart on System1 finds a .stale file next to main.cf; had and hashadow start but wait until a running configuration is available from another system over the private network, then build from it and update the local main.cf)

Page 48: Veritas Cluster 2.0

Forcing VCS to Start on the Local System

(Diagram: hastart -force on System1 builds the cluster configuration from the local main.cf even though a .stale file is present)

Page 49: Veritas Cluster 2.0

Forcing a System to Start

(Diagram: with had running on the systems but all of them holding .stale files, hasys -force System2 directs System2 to build the cluster configuration from its local main.cf)

Page 50: Veritas Cluster 2.0

The hasys Command

Alters or queries the state of had.

Syntax:
hasys -option [arg]

Options:
• -force system_name
• -list
• -display system_name
• -delete system_name
• -add system_name

Example:
hasys -force train11

Page 51: Veritas Cluster 2.0

Propagating a Specific Configuration

1. Stop VCS on all systems in the cluster and leave applications running:
   hastop -all -force

2. Start VCS stale on all other systems:
   hastart -stale
   The -stale option causes these systems to wait until a running configuration is available from which they can build.

3. Start VCS on the system with the main.cf that you are propagating:
   hastart

Page 52: Veritas Cluster 2.0

Summary of Start Options

The hastart command starts the had and hashadow daemons.

Syntax:
hastart [-option]

Options:
• -stale
• -force

Example:
hastart -force

Page 53: Veritas Cluster 2.0

Validating the Cluster Configuration

The hacf utility checks the syntax of the main.cf file.

Syntax:
hacf -verify config_directory

Example:
hacf -verify /etc/VRTSvcs/conf/config

Page 54: Veritas Cluster 2.0

Modifying Cluster Attributes

The haclus command is used to view and change cluster attributes.

Syntax:
haclus -option [arg]

Options:
• -display
• -help [-modify]
• -modify modify_options
• -value attribute
• -notes

Example:
haclus -value ClusterLocation

Page 55: Veritas Cluster 2.0

Startup States and Transitions

(State diagram: hastart moves a system from INITING to CURRENT_DISCOVER_WAIT when the configuration on disk is valid, or to STALE_DISCOVER_WAIT when it is stale. From CURRENT_DISCOVER_WAIT the system performs a LOCAL_BUILD if no peer is running, waits in CURRENT_PEER_WAIT while a peer is in LOCAL_BUILD, or performs a REMOTE_BUILD from a peer in RUNNING. From STALE_DISCOVER_WAIT the system enters STALE_ADMIN_WAIT if no peer is running, waits in STALE_PEER_WAIT while a peer builds, or performs a REMOTE_BUILD from a running peer. Successful builds end in RUNNING; a disk error, a peer in ADMIN_WAIT, or a crash of the only RUNNING peer leads to ADMIN_WAIT.)

Page 56: Veritas Cluster 2.0

Shutdown States and Transitions

(State diagram: from RUNNING, hastop moves the system to LEAVING, then to EXITING once resources are offlined and agents stopped, and finally to EXITED; hastop -force goes directly to EXITING_FORCIBLY. An unexpected exit of had leads to FAULTED, and loss of the running configuration leads to ADMIN_WAIT.)

Page 57: Veritas Cluster 2.0

Summary

You should now be able to:
• Describe the cluster configuration mechanisms.
• Start VCS.
• Stop VCS.
• Modify the cluster configuration.
• Explain the transition states of the cluster.

Page 58: Veritas Cluster 2.0

Lab 3: Managing Cluster Services

To complete this lab exercise:
• Use commands to start and stop cluster services, as described in the detailed lab instructions.
• Observe the cluster status by running hastatus in a terminal window.

Page 59: Veritas Cluster 2.0

VERITAS Cluster Server for Solaris

Lesson 4: Using the Cluster Manager Graphical User Interface

Page 60: Veritas Cluster 2.0

Overview

(Course roadmap: Introduction; Terms and Concepts; Installing VCS; Managing Cluster Services; Using Cluster Manager; Service Group Basics; Preparing Resources; Resources and Agents; NFS Resources; Using Volume Manager; Installing Applications; Cluster Communication; Faults and Failovers; Event Notification; Troubleshooting)

Page 61: Veritas Cluster 2.0

Objectives

After completing this lesson, you will be able to:
• Install Cluster Manager.
• Control access to VCS administration.
• Demonstrate Cluster Manager features.
• Create a service group.
• Create resources.
• Manage resources and service groups.
• Use the Web Console to administer VCS.

Page 62: Veritas Cluster 2.0

Installing Cluster Manager

Cluster Manager requirements on Solaris:
• 128 MB RAM
• 1280 x 1024 display resolution
• Minimum 8-bit monitor color depth; 24-bit is recommended

To install Cluster Manager:
pkgadd -d pkg_location VRTScscm

Page 63: Veritas Cluster 2.0

Cluster Manager Properties

• Can be run from a remote system:
  – Windows NT
  – Solaris system (cluster member or nonmember)
• Can manage multiple clusters from a single workstation
• Uses TCP port 14141 by default; to change it, add an entry such as the following to /etc/services:
  vcs 12345/tcp

Page 64: Veritas Cluster 2.0

Controlling Access to VCS: User Accounts

Cluster Administrator: Full privileges
Cluster Operator: All cluster, service group, and resource-level operations
Cluster Guest: Read-only access; new users are created as Cluster Guest accounts by default.
Group Administrator: All service group operations for a specified service group, except deleting service groups
Group Operator: Online and offline service groups and resources; temporarily freeze or unfreeze service groups

Page 65: Veritas Cluster 2.0

VCS User Account Hierarchy

(Diagram: Cluster Administrator includes the privileges of Cluster Operator and Group Administrator; Cluster Operator and Group Administrator each include the privileges of Group Operator; Group Operator includes the privileges of Cluster Guest)

Page 66: Veritas Cluster 2.0

Adding Users and Setting Privileges

The cluster configuration must be open. Users are added using the hauser command:
hauser -add username

Additional privileges can then be added:
haclus -modify Administrators -add user
haclus -modify Operators -add user
hagrp -modify group Administrators -add user
hagrp -modify group Operators -add user

The VCS user account admin is created with Cluster Administrator privilege by the installvcs utility. A complete session is sketched below.
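
For example, a session granting a new user Group Operator privileges on one service group (the user and group names are illustrative):

haconf -makerw
hauser -add sam
hagrp -modify mySG Operators -add sam
haconf -dump -makero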

Page 67: Veritas Cluster 2.0

Modifying User Accounts

To display account information:
hauser -display user_name

To change a password:
hauser -update user_name

To delete a VCS user account:
hauser -delete user_name

Page 68: Veritas Cluster 2.0

Controlling Access to the VCS Command Line Interface

• By default there is no mapping between UNIX and VCS user accounts, except root, which has Cluster Administrator privilege.
• Nonroot users are prompted for a VCS account name and password when executing VCS commands from the command line interface.
• The cluster attribute AllowNativeCliUsers can be set to map UNIX account names to VCS accounts.
• A VCS account with the same name as the UNIX user must exist, with appropriate privileges.

Page 69: Veritas Cluster 2.0

Cluster Manager Demonstration

Cluster Manager demonstration:
• Configuration and logging on
• Creating a service group and a resource
• Manual and automatic failover
• Log Desk, Command Log, Command Center, and Cluster Shell

Refer to your participant's guide; the steps are listed in the notes. If a live demonstration is not possible in class, the following slides walk through it.

Page 70: Veritas Cluster 2.0

Configuring Cluster Manager

(Screenshot: start the GUI with hagui &, then add and configure a cluster panel)

Page 71: Veritas Cluster 2.0

Logging In to Cluster Manager

(Screenshot: the login dialog and the cluster panel, showing service groups, member systems, and heartbeats)

Page 72: Veritas Cluster 2.0

VCS_2.0_Solaris_R1.0_20011130© Copyright 2001 VERITAS Software I-72

VCS Cluster Explorer

2

3

41

Page 73: Veritas Cluster 2.0

Creating a Service Group

Page 74: Veritas Cluster 2.0

Creating a Resource

Page 75: Veritas Cluster 2.0

Bringing a Resource Online

Page 76: Veritas Cluster 2.0

Resource and Service Group Status

Page 77: Veritas Cluster 2.0

Switching the Service Group to Another System

Page 78: Veritas Cluster 2.0

Service Group Switched

Page 79: Veritas Cluster 2.0

Changing MonitorInterval

Page 80: Veritas Cluster 2.0

Setting the Critical Attribute

Page 81: Veritas Cluster 2.0

Faulted Resources

Page 82: Veritas Cluster 2.0

Clearing a Faulted Resource

Page 83: Veritas Cluster 2.0

Log Desk

Page 84: Veritas Cluster 2.0

Command Log

Page 85: Veritas Cluster 2.0

Command Center

Page 86: Veritas Cluster 2.0

Shell Tool

Page 87: Veritas Cluster 2.0

Administering User Profiles

(Screenshot: dialogs to add a user account and to remove or modify a user account)

Page 88: Veritas Cluster 2.0

Using the Web Console

Web Console
• Manages existing resources and service groups:
  – Online, offline
  – Clearing faults and probing resources
  – Switching, flushing, and freezing service groups
• Cannot be used to create resources or service groups
• Runs on any system with a Java-enabled Web browser

Java Console
• Configures service groups and resources:
  – Add
  – Delete
  – Modify
• Can be used for all VCS administrative tasks
• Requires Cluster Manager and Java to be installed on the administration system

Page 89: Veritas Cluster 2.0

Connecting to the Web Console

(Screenshot: browse to http://IP_alias:8181/vcs and log in with a VCS account and password)

Page 90: Veritas Cluster 2.0

Cluster Summary

(Screenshot: the Web Console cluster summary page, with navigation buttons, log entries, and a display refresh control)

Page 91: Veritas Cluster 2.0

System View

(Screenshot: the system view page, with a navigation trail and the selected view)

Page 92: Veritas Cluster 2.0

Summary

You should now be able to:
• Install Cluster Manager.
• Control access to VCS administration.
• Demonstrate Cluster Manager features.
• Create a service group.
• Create resources.
• Manage resources and service groups.
• Use the Web Console to administer VCS.

Page 93: Veritas Cluster 2.0

Lab 4: Using Cluster Manager

(Diagram: Student Red creates service group RedGuiSG with resource RedFile managing /tmp/RedFile; Student Blue creates BlueGuiSG with resource BlueFile managing /tmp/BlueFile)

Page 94: Veritas Cluster 2.0

VERITAS Cluster Server for Solaris

Lesson 5: Service Group Basics

Page 95: Veritas Cluster 2.0

Overview

(Course roadmap: Introduction; Terms and Concepts; Installing VCS; Managing Cluster Services; Using Cluster Manager; Service Group Basics; Preparing Resources; Resources and Agents; NFS Resources; Using Volume Manager; Installing Applications; Cluster Communication; Faults and Failovers; Event Notification; Troubleshooting)

Page 96: Veritas Cluster 2.0

Objectives

After completing this lesson, you will be able to:
• Describe how application services relate to service groups.
• Translate application requirements to service group resources.
• Define common service group attributes.
• Create a service group using the command line interface.
• Perform basic service group operations.

Page 97: Veritas Cluster 2.0

Application Service

(Diagram: a database application service built from data and log storage, database software, a NIC, and an IP address that receives database requests from the network)

Page 98: Veritas Cluster 2.0

High Availability Applications

VCS must be able to perform these operations:
• Start the application using a defined startup procedure.
• Stop it using a defined shutdown procedure.
• Monitor it using a defined procedure.
• Share storage with other systems and store data to disk, rather than maintaining it in memory.
• Restart it to a known state.
• Migrate it to other systems.

Page 99: Veritas Cluster 2.0

Example Service Groups

(Diagram: a failover service group, such as a database, online on only one of SystemA and SystemB at a time, and a parallel service group, such as a Web service, online on both systems simultaneously)

Page 100: Veritas Cluster 2.0

Analyzing Applications

1. Specify application services corresponding to service groups.
2. Determine the high availability level and service group type, failover or parallel.
3. Specify which systems run which services and the desired failover policy.
4. Identify the hardware and software objects required for each service group and their dependencies.
5. Map the service group resources to actual hardware and software objects.

Page 101: Veritas Cluster 2.0

Example Application Services

Web service group:
• httpd
• /data on c1t3d0s3
• 192.168.3.56 on qfe1

Database service group:
• Database processes
• /oracle/data on c1t1d0s5
• /oracle/log on c1t2d0s4
• 192.168.3.55 on qfe1

Page 102: Veritas Cluster 2.0

Identify Physical Resources

Database Service Group:
• Database application
• File system /oracle/data, containing the data file(s)
• File system /oracle/log, containing the log file(s)
• IP address 192.168.3.55
• Network port qfe1
• Physical disk 1: c1t1d0s5
• Physical disk 2: c1t2d0s4

Page 103: Veritas Cluster 2.0

Map Physical Objects to VCS Resources

The database service group in the example requires:
• Two Disk resources to monitor the availability of the shared log disk and the shared data disk
• Two Mount resources that mount, unmount, and monitor the required log and data file systems
• A NIC resource to check the network connectivity on port qfe1
• An IP resource to configure the IP address that will be used by database clients to access the database
• An Oracle resource to start, stop, and monitor the Oracle database application

Page 104: Veritas Cluster 2.0

Service Groups

Create a service group using the command line interface:
• Syntax:
  hagrp -add group_name
• Example:
  hagrp -add mySG

Modify service group attributes to define behavior:
hagrp -modify group_name attribute value [values]

Page 105: Veritas Cluster 2.0

SystemList Attribute

• Defines the systems that can run the service group.
• The lowest numbered system has the highest priority in determining the target system for failover.

To define the SystemList attribute:
• Syntax:
  hagrp -modify group_name SystemList system1 priority1 system2 priority2 ...
• Example:
  hagrp -modify mySG SystemList train1 0 train2 1

Page 106: Veritas Cluster 2.0

AutoStart and AutoStartList Attributes

A service group is automatically started on a system when VCS is started (if it is not already online somewhere else in the cluster) under the following conditions:
• The AutoStart attribute is set to 1.
• The system is listed in its AutoStartList attribute.
• The system is listed in its SystemList attribute.

To define the AutoStart attribute (the default is 1):
hagrp -modify group_name AutoStart value

To define the AutoStartList attribute:
hagrp -modify group_name AutoStartList system1 system2 ...

Examples:
hagrp -modify myManualSG AutoStart 0
hagrp -modify mySG AutoStartList train0

Page 107: Veritas Cluster 2.0

AutoStartIfPartial Attribute

• Allows VCS to bring a service group with disabled resources online.
• All enabled resources must be probed.
• The default is 1, enabled.
• If 0, the service group cannot come online with disabled resources.

To define the AutoStartIfPartial attribute:
• Syntax:
  hagrp -modify group_name AutoStartIfPartial value
• Example:
  hagrp -modify group_name AutoStartIfPartial 0

Page 108: Veritas Cluster 2.0

Parallel Attribute

Parallel service groups:
• Run on more than one system at the same time
• Respond to system faults by:
  – Staying online on the remaining systems
  – Failing over to the specified target system

To set the Parallel attribute:
• Syntax:
  hagrp -modify group_name Parallel value
• Example:
  hagrp -modify myparallelSG Parallel 1

You must set the Parallel attribute before adding resources. Default value: 0 (failover).

Page 109: Veritas Cluster 2.0

Configuring a Service Group

(Flowchart: add the service group, set SystemList, set optional attributes, then add and test each resource; when no more resources remain, link the resources, test failover, set critical resources, and test switching; on any failure, check the logs, fix the problem, and retry)

Page 110: Veritas Cluster 2.0

Service Group Operations

Service group operations described in the following sections:

Bringing the service group online:
hagrp -online group_name -sys system_name

Taking the service group offline:
hagrp -offline group_name -sys system_name

Displaying service group properties:
hagrp -display group_name

Example command lines:
hagrp -online oraclegroup -sys train8
hagrp -offline oraclegroup -sys train8
hagrp -display oraclegroup

Page 111: Veritas Cluster 2.0

Bringing a Service Group Online

(Diagram: before, in-progress, and after views of a service group coming online; child resources such as Disk and NIC come online first, then Mount and IP, and finally the Oracle and Process resources at the top of the dependency tree)

Page 112: Veritas Cluster 2.0

Taking a Service Group Offline

(Diagram: before, in-progress, and after views of a service group going offline; the Oracle and Process resources at the top of the dependency tree go offline first, followed by Mount and IP, and finally Disk and NIC)

Page 113: Veritas Cluster 2.0

Partially Online Service Groups

(Diagram: a service group with some resources online and others offline)

A service group is partially online if:
• One or more nonpersistent resources is online.
• At least one resource is:
  – AutoStart enabled
  – Critical
  – Offline

Page 114: Veritas Cluster 2.0

Switching a Service Group

A manual failover can be accomplished by taking the service group offline on one system and bringing it online on another system. To switch a service group from one system to another using a single command:
• Syntax:
  hagrp -switch group_name -to system_name
• Example:
  hagrp -switch mySG -to train8

To switch using Cluster Manager:
Right-click on the group —> Switch to —> system.

Page 115: Veritas Cluster 2.0

Flushing a Service Group

Misconfigured resources can cause agent processes to hang. Flush the service group to stop all in-progress online and offline operations. To flush a service group using the command line:
• Syntax:
  hagrp -flush group_name -sys system_name
• Example:
  hagrp -flush mySG -sys train8

To flush a service group using Cluster Manager:
Right-click on the group —> Flush —> system.

Page 116: Veritas Cluster 2.0

Deleting a Service Group

Before deleting a service group:
1. Bring all resources offline.
2. Disable the resources.
3. Delete the resources.

To delete a service group using the command line:
• Syntax:
  hagrp -delete group_name
• Example:
  hagrp -delete mySG

To delete a service group using Cluster Manager:
Right-click on the group —> Delete.

Page 117: Veritas Cluster 2.0

Summary

You should now be able to:
• Describe how application services relate to service groups.
• Translate application requirements to service group resources.
• Define common service group attributes.
• Create a service group using the command line interface.
• Perform basic service group operations.

Page 118: Veritas Cluster 2.0

Lab 5: Creating Service Groups

(Diagram: Student Red creates RedNFSSG alongside the existing RedGuiSG; Student Blue creates BlueNFSSG alongside the existing BlueGuiSG)

Page 119: Veritas Cluster 2.0

VERITAS Cluster Server for Solaris

Lesson 6: Preparing Resources

Page 120: Veritas Cluster 2.0

Overview

(Course roadmap: Introduction; Terms and Concepts; Installing VCS; Managing Cluster Services; Using Cluster Manager; Service Group Basics; Preparing Resources; Resources and Agents; NFS Resources; Using Volume Manager; Installing Applications; Cluster Communication; Faults and Failovers; Event Notification; Troubleshooting)

Page 121: Veritas Cluster 2.0

Objectives

After completing this lesson, you will be able to:
• Describe the components required to create and share a file system using NFS.
• Prepare NFS resources.
• Describe the VCS network environment.
• Manually migrate the NFS services between two systems.
• Describe the process of automating high availability.

Page 122: Veritas Cluster 2.0

Operating System Components Related to NFS

File system-related resources:
• Hard disk partition
• File system to be mounted
• Directory to be shared
• NFS daemons

Network-related resources:
• IP address
• Network interface

Page 123: Veritas Cluster 2.0

Disk Resources

(Diagram: System 1 and System 2 both see partition 3 of the shared disk disk1 as /dev/(r)dsk/c1t1d0s3)

Page 124: Veritas Cluster 2.0

File System and Share Resources

(Diagram: either system can mount the vxfs file system on /dev/(r)dsk/c1t1d0s3 at /data and share it through its nfsd and mountd daemons)

Page 125: Veritas Cluster 2.0

Creating File System Resources

Format a disk and create a slice:
• Needs to be done on one system only.
• Use the format command.
• The device must have the same major and minor numbers on both systems (for NFS).

Create a file system on the slice, from one system only:
mkfs -F fstype /dev/rdsk/device_name
(You can use newfs for UFS file systems.)

Create a directory for a mount point on each system:
mkdir /mount_point

Page 126: Veritas Cluster 2.0

Sharing the File System

1. Mount the file system. It should not be mounted automatically at boot time. Check the file system first, if necessary:
   fsck -F fstype /dev/rdsk/device_name
   mount -F fstype /dev/dsk/device_name /mount_point

2. Start the NFS daemons, if they are not already running:
   /usr/lib/nfs/nfsd -a nservers
   /usr/lib/nfs/mountd

3. Share the file system:
   share /mount_point
   Note: The file system should not be shared automatically at boot time.

Page 127: Veritas Cluster 2.0

NFS Resource Dependencies

(Diagram: the share depends on the NFS daemons and the file system; the file system depends on the disk partition)

Page 128: Veritas Cluster 2.0

IP Addresses in a VCS Environment

Administrative IP addresses:
• Associated with the physical network interface, such as qfe1
• Assigned a unique hostname and IP address by the operating system at boot time
• Available only when the system is up and running
• Used for checking network connectivity
• Called base or maintenance IP addresses

Application IP addresses:
• Added as a virtual IP address to the network interface, such as qfe1:1
• Associated with an application service
• Controlled by the high availability software
• Migrated to other systems if the current system fails
• Also called service group or floating IP addresses

Page 129: Veritas Cluster 2.0

Configuring an Administrative IP Address

1. Create /etc/hostname.interface containing the desired interface name:
   vi /etc/hostname.qfe1
   train14_qfe1

2. Edit /etc/hosts and assign an IP address to the interface name:
   vi /etc/hosts
   ...
   166.98.112.14 train14_qfe1

3. Reboot the system.

Page 130: Veritas Cluster 2.0

Configuring Application IP Addresses

Requires the administrative IP address to be configured on the interface. Do not create a hostname file. To set up manually:

1. Plumb the logical interface and configure the IP address using ifconfig:
   ifconfig qfe1:1 plumb
   ifconfig qfe1:1 inet 166.98.112.114 netmask +

2. Bring up the IP address:
   ifconfig qfe1:1 up

3. Assign a virtual hostname (application service name) to the IP address:
   vi /etc/hosts
   ...
   166.98.112.114 nfs_services

Clients use the application IP address to connect to the application services.

Page 131: Veritas Cluster 2.0

NFS Services Resource Dependencies

(Diagram: the application IP depends on the share and the network interface; the share depends on the NFS daemons and the file system; the file system depends on the disk partition)

Page 132: Veritas Cluster 2.0

Monitoring NFS Resources

To verify the file system:
mount | grep mount_point

To verify the disk:
prtvtoc /dev/dsk/device_name
Alternately:
touch /mount_point/sub_dir/.testfile
rm /mount_point/sub_dir/.testfile

To verify the share:
share | grep mount_point

To verify the NFS daemons:
ps -ef | grep nfs

Page 133: Veritas Cluster 2.0

Monitoring the Network

To verify network connectivity, use ping to connect to other hosts on the same subnet as the administrative IP address:
ping 166.98.112.253
166.98.112.253 is alive

To verify the application IP address, use ifconfig to determine whether the IP address is up:
ifconfig -a

Page 134: Veritas Cluster 2.0

Migrating NFS Services

1. Make sure that the target system is available.
2. Make sure that the disk is accessible from the target system.
3. Make sure that the target system is connected to the network.
4. Bring the NFS services down on the first system, following the dependencies:
   a. Configure the application IP address down.
   b. Stop sharing the file system.
   c. Unmount the file system.
5. Bring the NFS services up on the target system, following the resource dependencies:
   a. Check and mount the file system.
   b. Start the NFS daemons if they are not already running.
   c. Share the file system.
   d. Configure and bring up the application IP address.
(The corresponding commands are sketched below.)
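
As a sketch using the example device, mount point, and address from this lesson (the interface and file system type are assumptions), the manual migration amounts to:

On the first system:
ifconfig qfe1:1 down
ifconfig qfe1:1 unplumb
unshare /data
umount /data

On the target system:
fsck -F vxfs /dev/rdsk/c1t1d0s3
mount -F vxfs /dev/dsk/c1t1d0s3 /data
share /data
ifconfig qfe1:1 plumb
ifconfig qfe1:1 inet 166.98.112.114 netmask + up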

Page 135: Veritas Cluster 2.0

Automating High Availability

Resources are created once; this is not part of HA operation.

Script the monitoring process:
• How often should each resource be monitored?
• What is the impact of monitoring on processing power?
• Are there any resources to be monitored on the target system even before failing over?

Script the start and stop processes.

Use high availability software to automate:
• Maintaining communication between systems to verify that the target system is available for failover
• Observing dependencies during starting and stopping
• Defining actions to take when a fault is detected

Page 136: Veritas Cluster 2.0

Summary

You should now be able to:
• Describe the components required to create and share a file system using NFS.
• Prepare NFS resources.
• Describe the VCS network environment.
• Manually migrate the NFS services between two systems.
• Describe the process of automating high availability.

Page 137: Veritas Cluster 2.0

Lab 6: Preparing NFS Resources

(Diagram: Student Red prepares /Redfs on c1t8d0s0 for RedNFSSG; Student Blue prepares /Bluefs on c1t15d0s0 for BlueNFSSG; RedGuiSG and BlueGuiSG remain from the previous lab)

Page 138: Veritas Cluster 2.0

VERITAS Cluster Server for Solaris

Lesson 7: Resources and Agents

Page 139: Veritas Cluster 2.0

Overview

(Course roadmap: Introduction; Terms and Concepts; Installing VCS; Managing Cluster Services; Using Cluster Manager; Service Group Basics; Preparing Resources; Resources and Agents; NFS Resources; Using Volume Manager; Installing Applications; Cluster Communication; Faults and Failovers; Event Notification; Troubleshooting)

Page 140: Veritas Cluster 2.0

Objectives

After completing this lesson, you will be able to:
• Describe how resources and resource types are defined in VCS.
• Describe how agents work.
• Describe cluster configuration files.
• Modify the cluster configuration.
• Use the Disk resource and agent.
• Use the Mount resource and agent.
• Create a service group.
• Configure resources.
• Perform resource operations.

Page 141: Veritas Cluster 2.0

Resources

(Diagram: an NFS service group containing Disk, NIC, IP, Mount, Share, and NFS resources)

Page 142: Veritas Cluster 2.0

Resource Definitions (main.cf)

Mount MyNFSMount (
    MountPoint = "/test"
    BlockDevice = "/dev/dsk/c1t2d0s4"
    FSType = vxfs
    )

(Annotated: Mount is the type, MyNFSMount is the unique name, and the body lists attributes and their values.)

Page 143: Veritas Cluster 2.0

Nonpersistent and Persistent Resources

Nonpersistent resources:
• Operations = OnOff

Persistent resources:
• Operations = OnOnly
• Operations = None

Example types.cf entry:
type Disk (
    static str ArgList[] = { Partition }
    NameRule = resource.Partition
    static str Operations = None
    str Partition
    )

Page 144: Veritas Cluster 2.0

Resource Types

(Diagram: the IP resource type with resources NFS_IP, WEB_IP, and ORACLE_IP; the NIC resource type with resources NFS_NIC_qfe1 and ORACLE_NIC_qfe2)

Page 145: Veritas Cluster 2.0

Resource Type Definitions (types.cf)

type Mount (
    static str ArgList[] = { MountPoint, BlockDevice, FSType, MountOpt, FsckOpt, SnapUmount }
    NameRule = resource.MountPoint
    str MountPoint
    str BlockDevice
    str FSType
    str MountOpt
    str FsckOpt
    int SnapUmount = 0
    )

(Annotated: the type keyword introduces the unique type name, ArgList names the arguments passed to the agent, NameRule defines how resource names are generated, and the remaining lines declare the attributes and their types.)

Page 146: Veritas Cluster 2.0

Bundled Resource Types

Application, Disk, DiskGroup, DiskReservation, ElifNone, FileNone, FileOnOff, FileOnOnly, IP, IPMultiNIC, Mount, MultiNICA, NFS, NIC, Phantom, Process, Proxy, ServiceGroupHB, Share, Volume

Page 147: Veritas Cluster 2.0

Agents

• Periodically monitor resources and send status information to the VCS engine.
• Bring resources online when requested by the VCS engine.
• Take resources offline upon request.
• Restart resources when they fault (depending on the resource configuration).
• Send a message to the VCS engine and the agent log file when errors are detected.

Page 148: Veritas Cluster 2.0

How Agents Work

(Diagram: the VCS engine reads the myNFSIP resource definition from main.cf and the IP type definition from types.cf, then sends "Online myNFSIP" to the IP agent; the agent's online entry point runs ifconfig qfe1:1 192.20.47.11 up)

main.cf:
IP myNFSIP (
    Device = qfe1
    Address = "192.20.47.11"
    )

types.cf:
type IP (
    static str ArgList[] = { Device, Address, Netmask, Options, ArpDelay, IfconfigTwice }
    ...
    )

Page 149: Veritas Cluster 2.0

Enterprise Agents

• Database Edition / HA 2.2 for Oracle
• Informix
• VERITAS NetBackup
• Oracle
• PC NetLink
• Sun Internet Mail Server (SIMS)
• Sybase
• VERITAS NetApp
• Apache
• Firewall (Checkpoint and Raptor)
• Netscape SuiteSpot

Page 150: Veritas Cluster 2.0

The main.cf File

• Cluster-wide configuration
• Service groups
• Resources
• Resource dependencies
• Service group dependencies
• Resource type dependencies
• Resource types, by way of include statements

Page 151: Veritas Cluster 2.0

Cluster Definition (main.cf)

Include for the type definition files:
include "types.cf"

Cluster name and Cluster Manager users:
cluster mycluster (
    UserNames = { admin = "cDRpdxPmHpzS." }
    CounterInterval = 5
    )

Systems that are members of the cluster:
system train7
system train8

Page 152: Veritas Cluster 2.0

Service Group Definition (main.cf)

group MyNFSSG (
    SystemList = { train8 = 1, train7 = 2 }
    AutoStartList = { train8 }
    )

Mount MyNFSMount (
    MountPoint = "/data"
    BlockDevice = "/dev/dsk/c1t1d0s3"
    FSType = vxfs
    )

Disk MyNFSDisk (
    Partition = c1t1d0s3
    )

MyNFSMount requires MyNFSDisk

(Annotated: the group statement carries the service group attributes, each resource block carries resource attributes, and requires statements define resource dependencies.)

Page 153: Veritas Cluster 2.0

Modifying the Cluster Configuration

Online configuration:
• Use Cluster Manager or the command line interface. Changes are made to the in-memory configuration on each system while the cluster is running.
• Save the cluster configuration from memory to disk:
  – File —> Save Configuration
  – haconf -dump

Offline configuration:
• Edit main.cf.
• Restart VCS.

Page 154: Veritas Cluster 2.0

Modifying Resource Types

Online configuration:
• Use Cluster Manager.
• Use the hatype command.
• Save changes to synchronize the in-memory configuration with the configuration files on disk.

Offline configuration:
• Edit types.cf to change existing resource type definitions.
• Edit main.cf to add include statements for new agents with their own types file.
• Restart VCS.

Page 155: Veritas Cluster 2.0

Changing Agent Behavior

• Use Cluster Manager.
• Use the CLI:
  hatype -modify Disk MonitorInterval 30
• Edit types.cf:
  type Disk (
      static str ArgList[] = { Partition }
      NameRule = group.Name + "_" + resource.Partition
      static str Operations = None
      str Partition
      int MonitorInterval = 30
      )

Page 156: Veritas Cluster 2.0

The Disk Resource and Agent

Functions:
• Online: None (the Disk type is persistent.)
• Offline: None
• Monitor: Determines whether the disk is online by reading from the raw device

Required attributes:
• Partition: UNIX partition device name (if no path is specified, it is assumed to be in /dev/rdsk)

No optional attributes.

Configuration prerequisite: the UNIX device file must exist.

Sample configuration:
Disk MyNFSDisk (
    Partition = c1t0d0s0
    )

Page 157: Veritas Cluster 2.0

The Mount Resource and Agent

Functions:
• Online: Mounts a file system
• Offline: Unmounts a file system
• Monitor: Checks mount status using stat and statvfs

Required attributes:
• BlockDevice: UNIX file system device name
• FSType: File system type
• MountPoint: Directory used to mount the file system

Optional attributes:
• FsckOpt, MountOpt, SnapUmount

Page 158: Veritas Cluster 2.0

Mount Resource Configuration

Configuration prerequisites:
• Create the file system on the disk partition (or volume).
• Create the mount point directory on each system.
• Configure the VCS Disk resource on which Mount depends.
• Verify that there is no entry in /etc/vfstab.

Sample configuration:
Mount myNFSMount (
    MountPoint = "/export1"
    BlockDevice = "/dev/dsk/c1t1d0s3"
    FSType = vxfs
    MountOpt = "-o ro"
    )

When setting MountOpt with hares, use % to escape arguments starting with a dash (-):
hares -modify myNFSMount MountOpt %"-o ro"

Page 159: Veritas Cluster 2.0

Configuring a Service Group

(Flowchart: add the service group, set SystemList, set optional attributes, then add and test each resource; when no more resources remain, link the resources, test failover, set critical resources, and test switching; on any failure, check the logs, fix the problem, and retry)

Page 160: Veritas Cluster 2.0

Configuring a Resource

(Flowchart: add the resource, set it non-critical, modify its attributes, enable it, and bring it online; if it comes online, you are done; if it faults or hangs waiting to online, check the log, clear the resource, flush the group, disable the resource, fix the attributes, and re-enable it. A command line version is sketched below.)
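
As a sketch using a Mount resource (names and values are illustrative):

haconf -makerw
hares -add MySGMount Mount MySG
hares -modify MySGMount Critical 0
hares -modify MySGMount MountPoint "/data"
hares -modify MySGMount BlockDevice "/dev/dsk/c1t1d0s3"
hares -modify MySGMount FSType vxfs
hares -modify MySGMount Enabled 1
hares -online MySGMount -sys train1
haconf -dump -makero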

Page 161: Veritas Cluster 2.0

Adding a Resource

(Screenshot: the Add Resource dialog)

Suggestion: use the service group name as a prefix for resource names.

Page 162: Veritas Cluster 2.0

Modifying a Resource

• Enter values for each required attribute.
• Modify optional attributes, if necessary.
• See the Bundled Agents Reference Guide for a complete description of all attributes.

Page 163: Veritas Cluster 2.0

Setting the Critical Attribute

If a critical resource is faulted or taken offline due to a fault, the entire service group fails over. By default, all resources are critical. Set the Critical attribute to 0 to make a resource noncritical.

Page 164: Veritas Cluster 2.0

Enabling a Resource

• Resources must be enabled in order to be managed by the agent. If necessary, the agent initializes the resource when it is enabled.
• All required attributes of a resource must be set before the resource is enabled.
• By default, resources are not enabled.

Page 165: Veritas Cluster 2.0

Bringing a Resource Online

Resources in a failover service group cannot be brought online if any resource in the service group is:
• Online on another system
• Waiting to go online on another system

Page 166: Veritas Cluster 2.0

VCS_2.0_Solaris_R1.0_20011130© Copyright 2001 VERITAS Software I-166

Creating Resource Dependencies

Parent resources depend on child resources:
• The child resource must be online before the parent resource can come online.
• The parent resource must go offline before the child resource can go offline.

• Parent resources cannot be persistent type resources.
• You cannot link resources in different service groups.
• Resources can have an unlimited number of parent and child resources.
• Cyclical dependencies are not allowed.

Linking Resources

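From the command line, links are created with hares -link parent child. A sketch using the sample resources from this lesson, where the Mount (parent) depends on the Disk (child):

hares -link myNFSMount MyNFSDisk     # Mount comes online only after Disk
hares -unlink myNFSMount MyNFSDisk   # remove the dependency again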

Taking a Resource Offline

Take individual resources offline in order, from the top of the dependency tree to the bottom.
Use Offline Propagate to take all resources offline. The selected resource:
• Must be the top online resource in the dependency tree
• Must have no online parent resources

Clearing Faults

Faulted resources must be cleared before they can be brought online.
Persistent resources are cleared when the problem is fixed and they are probed by the agent.
• Offline resources are probed periodically.
• Resources can be manually probed.

Disabling a Resource

• VCS calls the agent on each system in the SystemList.
• The agent calls the Close entry point, if present, to reset the resource.
• Nonpersistent resources are brought offline.
• The agent stops monitoring disabled resources.

Deleting a Resource

Before deleting a resource:
• Take all parent resources offline.
• Take the resource offline.
• Disable the resource.
• Unlink any dependent resources.

Delete all resources before deleting a service group.

Summary

You should now be able to:
• Describe how resources and resource types are defined in VCS.
• Describe how agents work.
• Describe cluster configuration files.
• Modify the cluster configuration.
• Use the Disk resource and agent.
• Use the Mount resource and agent.
• Create a service group.
• Configure resources.
• Perform resource operations.

Lab 7: Configuring Resources

[Lab diagram: Student Red builds RedNFSSG with RedNFSDisk (disk1, c1t8d0s0) and RedNFSMount (/Redfs); Student Blue builds BlueNFSSG with BlueNFSDisk (disk2, c1t15d0s0) and BlueNFSMount (/Bluefs). The RedGuiSG and BlueGuiSG groups are also shown.]

VERITAS Cluster Server for Solaris
Lesson 8: Network File System (NFS) Resources

Overview

[Course roadmap: Introduction, Terms and Concepts, Installing VCS, Using Cluster Manager, Managing Cluster Services, Service Group Basics, Preparing Resources, Resources and Agents, NFS Resources, Event Notification, Faults and Failovers, Installing Applications, Using Volume Manager, Cluster Communication, Troubleshooting]

Objectives

After completing this lesson, you will be able to:
• Prepare NFS services for the VCS environment.
• Describe the Share resource and agent.
• Describe the NFS resource and agent.
• Describe the NIC resource and agent.
• Describe the IP resource and agent.
• Configure and test an NFS service group.

NFS Service Group

[Diagram: an NFS service group containing NFS, Share, Mount, Disk, IP, and NIC resources arranged in a dependency tree.]

NFS Setup for VCS

Major and minor numbers for block devices used for NFS services must be the same on each system.

[Diagram: before failover, an NFS request receives an NFS response; after failover with mismatched device numbers, the client's NFS request returns a stale file handle error.]

Major/Minor Numbers for Partitions

Each system must have the same major and minor numbers for the shared partition. Major/minor numbers must also be unique within a system.

On System A:
ls -lL /dev/dsk/c1t1d0s3
brw-r-----  root sys 32,134 Dec 3 11:50 /dev/dsk/c1t1d0s3

On System B:
ls -lL /dev/dsk/c1t1d0s3
brw-r-----  root sys 36,134 Dec 3 11:55 /dev/dsk/c1t1d0s3

To make the major numbers the same on all systems:
haremajor -sd major_number
Example:
haremajor -sd 36

Major Numbers for Volumes

Verify that the major numbers match on all systems:

On System A:
grep ^vx /etc/name_to_major
vxdmp 87
vxio 88
vxspec 89

On System B:
grep ^vx /etc/name_to_major
vxdmp 89
vxio 90
vxspec 91

Changing Major Numbers for Volumes

Each system must have the same major numbers for the shared volume. Major numbers must also be unique within a system.

To make the major numbers the same on all systems:
• Before running vxinstall:
  – Edit /etc/name_to_major manually and change the VM major numbers to be the same on both systems.
  – Reboot the systems where the change was made.
• After running vxinstall:
  haremajor -vx major_num1 major_num2
• Example:
  haremajor -vx 91 92

The Share Resource and Agent

Functions:
Online   Shares an NFS file system
Offline  Unshares an NFS file system
Monitor  Reads the /etc/dfs/sharetab file to check for an entry for the file system

Required attributes:
PathName  Pathname of the file system

Optional attributes: Options

Configuration prerequisites:
• The file system to be shared should not have an entry in /etc/dfs/dfstab.
• Mount and NFS resources must be configured.
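No sample configuration appears on this slide; a minimal sketch of a Share resource, assuming the /export1 file system from the Mount example:

Share mySGShare (
    PathName = "/export1"
)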

The NFS Resource and Agent

Functions:
Online   Starts the nfsd and mountd processes if they are not already running
Offline  None (NFS is an OnOnly resource.)
Monitor  Checks for the nfsd, mountd, lockd, and statd processes

Required attributes: None
Optional attributes: Nservers (default = 16)

Configuration prerequisites: None

Sample configuration:
NFS mySGNFS (
    Nservers = 24
)

The NIC Resource and Agent

Functions:
Online   None (NIC is persistent.)
Offline  None
Monitor  Uses ping to check connectivity and determine whether the interface is up

Required attributes:
Device  NIC device name

Optional attributes:
NetworkType, PingOptimize, NetworkHosts

NIC Resource Configuration

Configuration prerequisites:
• Configure Solaris to plumb the interface during system boot. Edit these files:
  – /etc/hosts
  – /etc/hostname.interface
• Reboot the system.

Sample configuration:
NIC mySGNIC (
    Device = qfe1
    NetworkHosts = { "192.20.47.254", "192.20.47.253" }
)

The IP Resource and Agent

Functions:
Online   Configures a virtual IP address on an interface
Offline  Removes an IP address from an interface
Monitor  Determines whether a virtual IP address is present on the interface

This is the IP address that users connect to and that fails over between systems in the cluster.

Required attributes:
Device   Name of the NIC
Address  Unique application (virtual) IP address

Optional attributes:
NetMask, Options, ArpDelay (default = 1s), IfconfigTwice (default = 0)

IP Resource Configuration

Configuration prerequisites:
Configure a NIC resource.

Sample configuration:
IP mySGIP (
    Device = qfe1
    Address = "192.20.47.61"
)

Configuring an NFS Service Group

[Flow chart summarized as steps:]
1. Add the service group:
   hagrp -add mySG
2. Set the SystemList:
   hagrp -modify mySG SystemList sys1 0 sys2 1
3. Set optional attributes:
   hagrp -modify mySG Attribute Value
4. Add and test each resource (see the resource flow chart); repeat while more resources remain, then test the group.
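Putting the lesson together, a minimal main.cf sketch of a finished NFS service group. The names and addresses are taken from this lesson's examples, and one common dependency arrangement is shown; your group and resource names will differ:

group mySG (
    SystemList = { sys1 = 0, sys2 = 1 }
)

Disk mySGDisk (
    Partition = c1t1d0s3
)

Mount mySGMount (
    MountPoint = "/export1"
    BlockDevice = "/dev/dsk/c1t1d0s3"
    FSType = vxfs
)

NFS mySGNFS (
)

Share mySGShare (
    PathName = "/export1"
)

NIC mySGNIC (
    Device = qfe1
)

IP mySGIP (
    Device = qfe1
    Address = "192.20.47.61"
)

mySGMount requires mySGDisk
mySGShare requires mySGMount
mySGShare requires mySGNFS
mySGIP requires mySGNIC
mySGIP requires mySGShare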

Configuring NFS Resources

[Resource flow chart summarized as steps:]
1. Add the resource:
   hares -add mySGIP IP mySG
2. Set it non-critical:
   hares -modify mySGIP Critical 0
3. Modify attributes:
   hares -modify mySGIP Attribute Value
4. Enable the resource:
   hares -modify mySGIP Enabled 1
5. Bring it online:
   hares -online mySGIP -sys sys1
6. If the resource comes online, you are done; otherwise, troubleshoot the resource.

Troubleshooting Resources

[Flow chart summarized as steps:]
1. If the resource is waiting to go online, flush the group:
   hagrp -flush mySG -sys sys1
2. If the resource is faulted, clear it:
   hares -clear mySGIP
3. Check the log, then disable the resource:
   hares -modify mySGIP Enabled 0
4. Modify the attributes to fix the problem, enable the resource, and bring it online again. When it comes online, you are done.

Testing the Service Group

[Flow chart summarized as steps:]
1. Link the resources:
   hares -link mySGIP mySGNIC
2. Set the resources critical:
   hares -modify mySGIP Critical 1
   hares -modify mySGNIC Critical 1
   hares -modify ...
3. Test switching:
   hagrp -switch mySG -to sys2
4. Test failover. If successful, you are done; otherwise, check the logs, fix the problem, and retest.

Summary

You should now be able to:
• Prepare NFS services for the VCS environment.
• Describe the Share resource and agent.
• Describe the NFS resource and agent.
• Describe the NIC resource and agent.
• Describe the IP resource and agent.
• Configure and test an NFS service group.

Lab 8: Creating an NFS Service Group

[Lab diagram: Student Red's RedNFSSG contains RedNFSDisk, RedNFSMount, RedNFSNIC, RedNFSIP, RedNFSNFS, and RedNFSShare; Student Blue's BlueNFSSG contains the corresponding Blue resources.]

VERITAS Cluster Server for Solaris
Lesson 9: Event Notification

Overview

[Course roadmap: Introduction, Terms and Concepts, Installing VCS, Using Cluster Manager, Managing Cluster Services, Service Group Basics, Preparing Resources, Resources and Agents, NFS Resources, Event Notification, Faults and Failovers, Installing Applications, Using Volume Manager, Cluster Communication, Troubleshooting]

Objectives

After completing this lesson, you will be able to:
• Describe the VCS notifier component.
• Configure the notifier to signal changes in cluster status.
• Describe SNMP configuration.
• Describe event triggers.
• Configure triggers to provide notification.

Notification

How VCS performs notification:
1. The had daemon sends a message to the notifier daemon when an event occurs.
2. The notifier daemon formats the event message and sends an SNMP trap or e-mail message (or both) to designated recipients.

[Diagram: the had daemons on each system feed the notifier daemon, which sends SMTP and SNMP messages.]

Message Severity Levels

[Diagram: had daemons feed the notifier, which routes messages by severity. Example events per level:]

Information  Service group is online.
Warning      Agent has faulted.
Error        Resource has faulted.
SevereError  Concurrency violation

Message Queues

1. The had daemon stores a message in a queue when an event is detected.
2. The message is sent over the private cluster network to all other had daemons to replicate the message queue.
3. The notifier daemon can be started on another system in case of failure, without loss of messages.

[Diagram: each system's had daemon holds a copy of the replicated queue; whichever system runs the notifier drains it to SMTP and SNMP.]

Configuring Notifier

The notifier daemon can be started and monitored by the NotifierMngr resource.
Attributes define recipients and severity levels. For example:
SmtpServer = "smtp.acme.com"
SmtpRecipients = { "[email protected]" = Warning }

[Diagram: on each system, a NotifierMngr resource depending on a NIC resource manages the notifier daemon.]

The NotifierMngr Agent

Functions: Starts, stops, and monitors the notifier daemon

Required attribute:
PathName  Full path of the notifier daemon

Required attributes for SMTP e-mail notification:
SmtpServer      Host name of the SMTP e-mail server
SmtpRecipients  E-mail address and message severity level for each recipient

Required attribute for SNMP notification:
SnmpConsoles    Name of the SNMP manager and message severity level

The NotifierMngr Resource

Optional attributes:
MessagesQueue          Size of the message queue; default = 30
NotifierListeningPort  TCP/IP port number; default = 14144
SnmpdTrapPort          TCP/IP port to which SNMP traps are sent; default = 162
SnmpCommunity          Community ID for the SNMP manager; default = "public"

Example resource configuration:
NotifierMngr Notify_Ntfr (
    PathName = "/opt/VRTSvcs/bin/notifier"
    SnmpConsoles = { snmpserv = Information }
    SmtpServer = "smtp.your_company.com"
    SmtpRecipients = { "vcsadmin@your_company.com" = SevereError }
)

SNMP Configuration

Load the MIB for VCS traps into the SNMP console.
For HP OpenView Network Node Manager, merge the events:
xnmevents -merge vcs_trapd

VCS SNMP configuration files:
• /etc/VRTSvcs/snmp/vcs.mib
• /etc/VRTSvcs/snmp/vcs_trapd

Event Triggers

How VCS performs notification:
1. VCS determines whether notification is enabled.
   • If disabled, no action is taken.
   • If enabled, VCS runs hatrigger with event-specific parameters.
2. The hatrigger script invokes the event-specific trigger script with the parameters passed by VCS.
3. The event trigger script performs the notification tasks.

Types of Triggers

Trigger         Description                               Script Name
ResFault        Resource faulted                          resfault
ResNotOff       Resource not offline                      resnotoff
ResStateChange  Resource changed state                    resstatechange
LoadWarning     System is overloaded                      loadwarning
NoFailover      Service group cannot fail over            nofailover
PostOffline     Service group went offline                postoffline
PostOnline      Service group went online                 postonline
PreOnline       Service group about to come online        preonline
Violation       Resource online on more than one system   violation
InJeopardy      Cluster in jeopardy                       injeopardy
SysOffline      System went offline                       sysoffline

Configuring Triggers

Triggers enabled by the presence of a script file:
• ResFault
• ResNotOff
• SysOffline
• InJeopardy
• Violation
• NoFailover
• PostOffline
• PostOnline
• LoadWarning

Triggers configured by service group attributes:
• PreOnline
• ResStateChange

Triggers configured by default:
• Violation

Sample Triggers

Sample trigger scripts include example code to send an e-mail message.
Mail must be configured on the system invoking the trigger in order to use the sample e-mail code.

# Here is a sample code to notify a bunch of users.
# @recipients=("[email protected]");
# $msgfile="/tmp/resnotoff$2";
# `echo system = $ARGV[0], resource = $ARGV[1] > $msgfile`;
#
# foreach $recipient (@recipients) {
#     # Must have elm setup to run this.
#     `elm -s resnotoff $recipient < $msgfile`;
# }
# `rm $msgfile`;

ResFault Trigger

Provides notification that a resource has faulted.
Arguments to resfault:
• system: Name of the system where the resource faulted
• resource: Name of the faulted resource
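A minimal sketch of a resfault trigger script, placed in /opt/VRTSvcs/bin/triggers/resfault. The mailx command and the root recipient are assumptions; substitute any notification mechanism configured at your site:

#!/bin/sh
# resfault: invoked by hatrigger with the system and resource names
SYSTEM=$1
RESOURCE=$2
echo "Resource $RESOURCE faulted on system $SYSTEM" | \
    mailx -s "VCS resfault" root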

ResNotOff Trigger

Provides notification that a resource has not been taken offline.
If a resource is not offline on one system, the service group cannot be brought online on another. VCS cannot fail over the service group in the event of a fault, because the resource will not come offline.
Arguments to resnotoff:
• system: Name of the system where the resource is not offline
• resource: Name of the resource that is not offline

ResStateChange Trigger

Provides notification that a resource has changed state.
Enabled at the service group level by setting the TriggerResStateChange attribute:
hagrp -modify serv_grp TriggerResStateChange 1

Arguments to resstatechange:
• system: Name of the system where the resource changed state
• resource: Name of the resource
• previous_state: State of the resource before the change
• new_state: State of the resource after the change

SysOffline Trigger

Provides notification that a system has gone offline.
Executed on another system when no heartbeat is detected.
Arguments to sysoffline:
• system: Name of the system that went offline
• systemstate: Value of the SysState attribute for the offline system

NoFailover Trigger

Run when VCS determines that a service group cannot fail over.
Executed on the lowest-numbered system in a running state when the condition is detected.
Arguments to nofailover:
• systemlastonline: Name of the last system where the service group was online or partially online
• service_group: Name of the service group that cannot fail over

Summary

You should now be able to:
• Describe the VCS notifier component.
• Configure the notifier to signal changes in cluster status.
• Describe SNMP configuration.
• Describe event triggers.
• Configure triggers to provide notification.

Lab 9: Event Notification

[Lab diagram: Student Red's RedNFSSG and Student Blue's BlueNFSSG each use the resfault, nofailover, and sysoffline triggers; the ClusterService group contains the notifier, webip, and webnic resources.]

VERITAS Cluster Server for Solaris
Lesson 10: Faults and Failovers

Overview

[Course roadmap: Introduction, Terms and Concepts, Installing VCS, Using Cluster Manager, Managing Cluster Services, Service Group Basics, Preparing Resources, Resources and Agents, NFS Resources, Event Notification, Faults and Failovers, Installing Applications, Using Volume Manager, Cluster Communication, Troubleshooting]

Objectives

After completing this lesson, you will be able to:
• Describe how VCS responds to faults.
• Implement failover policies.
• Set limits and prerequisites.
• Use system zones to control failover.
• Control failover behavior using attributes.
• Clear faults.
• Probe resources.
• Flush service groups.
• Test failover.

How VCS Responds to Resource Faults

1. Calls the ResFault trigger, if present.
2. Takes offline all resources in the path of the fault, starting from the faulted resource up to the top of the dependency tree.
3. If an online critical resource is part of the path, takes the entire service group offline in preparation for failover.
4. Starts the service group on another system in the service group's SystemList (if possible).
5. If no other systems are available, the service group remains offline and the NoFailover trigger is invoked, if present.

Practice Exercise

[Exercise diagram: a service group with resources 1-9 in a dependency tree; resources 6 and 7 are above resource 4. Resource 4 faults.]

For each case, determine which resources are taken offline due to the fault and whether the service group starts on another system:

Case  Offline  Non-Critical
A     -        -
B     -        4
C     6,7      4
D     -        4,6
E     -        4,6,7
F     7        4

Practice Answers

Case  Offline  Non-Critical  Taken offline due to fault  Starts on another system
A     -        -             6,7                         All
B     -        4             6,7                         All
C     6,7      4             -                           -
D     -        4,6           6,7                         All
E     -        4,6,7         6,7                         -
F     7        4             6                           All but 7

Failover Attributes

AutoFailOver indicates whether automatic failover is enabled for the service group. The default value is 1, enabled.

FailOverPolicy specifies how a target system is selected:
• Priority: The system with the lowest priority number in the list is selected (default).
• RoundRobin: The system with the least number of active service groups is selected.
• Load: The system with the greatest available capacity is selected.

Example configuration:
hagrp -modify group AutoFailOver 0
hagrp -modify group FailOverPolicy Load

FailOverPolicy: Priority

The lowest-numbered running system in SystemList is selected.

[Diagram: DB runs on Svr1 with SystemList = {Svr1 = 0, Svr2 = 1}; AP1 runs on Svr2 with SystemList = {Svr2 = 0, Svr1 = 1}; AP2 runs on Svr3 with SystemList = {Svr3 = 0, Svr1 = 1, Svr2 = 2}.]

FailOverPolicy: RoundRobin

The system with the fewest running service groups is selected.

[Diagram: four systems, Svr1 through Svr4, running different numbers of service groups.]

FailOverPolicy: Load

The system with the greatest AvailableCapacity is selected.

[Diagram: servers LgSvr1, LgSvr2, SmSvr1, and SmSvr2 with Capacity values of 100-200 and AvailableCapacity values of 70-100, running service groups DB1, DB2, AP1, and AP2 with Load values of 20-100.]

Setting Load and Capacity

The Load and Capacity attributes are user-defined values.
Set the attributes using the hagrp and hasys commands.
Examples:
hasys -modify SmSvr1 Capacity 100
hagrp -modify AP1 Load 30

AvailableCapacity is calculated by VCS:
Capacity minus Load equals AvailableCapacity.
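To verify the values VCS has calculated, the attributes can be displayed. A sketch using the system and group names from the example above:

hasys -display SmSvr1 -attribute AvailableCapacity
hagrp -display AP1 -attribute Load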

Load-Based Failover Example

[Diagram: Svr1 runs G1 (Load=20) and G6 (Load=30), AvailableCapacity=50; Svr2 runs G2 (Load=40) and G8 (Load=40), AvailableCapacity=20; Svr3 runs G3 (Load=30) and G7 (Load=20), AvailableCapacity=50; Svr4 runs G4 (Load=10) and G5 (Load=50), AvailableCapacity=40. Each system has Capacity=100, and all four systems appear in each group's SystemList.]

When Svr4's groups must fail over: G4 migrates to Svr1 and G5 migrates to Svr3, the systems with the greatest AvailableCapacity.

The LoadWarning Trigger

Svr3 runs the LoadWarning trigger when AvailableCapacity is 20 or less (80 percent of Capacity consumed) for 10 minutes (600 seconds).

[Diagram: after failover, Svr3 runs G3 (Load=30), G7 (Load=20), and G5 (Load=50), leaving AvailableCapacity=0. Svr3's system definition:]

System Svr3 (
    Capacity = 100
    LoadWarningLevel = 80
    LoadTimeThreshold = 600
)

Dynamic Load

The DynamicLoad attribute is used in conjunction with load-estimation software. It is set using the hasys command.

[Diagram: SmSvr1 (Capacity=100) reports hasys -load 90, so AvailableCapacity=10 and the system is 90 percent loaded; LgSvr2 (Capacity=200) reports hasys -load 160, so AvailableCapacity=40 and the system is 80 percent loaded.]

Limits and Prerequisites

[Diagram: LgSvr1 and LgSvr2 have Limits = { Mem=100, Processors=12 } and CurrentLimits = { Mem=50, Processors=8 }; SmSvr1 and SmSvr2 have Limits = { Mem=75, Processors=6 } and CurrentLimits = { Mem=50, Processors=4 }. DB1 and DB2 have Prerequisites = { Mem=50, Processors=4 }; AP1 and AP2 have Prerequisites = { Mem=25, Processors=2 }.]

DB1 or DB2 can fail over to either SmSvr1 or SmSvr2.
Both AP1 and AP2 can fail over to either LgSvr1 or LgSvr2.

Combining Capacity and Limits

When used together, VCS determines the failover target as follows:
• Limits and Prerequisites are used to determine a subset of potential failover targets.
• Of this subset, the system with the highest value for AvailableCapacity is selected.
• If multiple systems have the same AvailableCapacity, the first system in SystemList is selected.

Limits are hard values: if a system does not meet the Prerequisites, the service group cannot be started on that system.
Capacity is a soft limit: the system with the highest AvailableCapacity is selected, even if bringing the group online leaves its AvailableCapacity negative.

Failover Zones

[Diagram: a six-system cluster. Systems sysa and sysb form the preferred failover zone for the Web service group; systems sysc, sysd, syse, and sysf form the preferred failover zone for the Database service group.]

The SystemList for both service groups includes all systems in the cluster.

SystemZones Attribute

Used to define the preferred failover zones for each service group.
If the service group is online in a system zone, it fails over to other systems in the same zone, based on the FailOverPolicy, until there are no further systems available in that zone.
When there are no other systems for failover in the same zone, VCS chooses a system in a new zone from the SystemList, based on the FailOverPolicy.

To define SystemZones:
• Syntax:
hagrp -modify group_name SystemZones sys1 zone# sys2 zone# ...
• Example:
hagrp -modify OracleSG SystemZones sysa 0 sysb 0 sysc 1 sysd 1 syse 1 sysf 1

Controlling Failover Behavior with Resource Type Attributes

RestartLimit
• Affects how the agent responds to a resource fault
• Default: 0

ConfInterval
• Determines the amount of time within which a tolerance or restart counter can be incremented
• Default: 600 seconds

ToleranceLimit
• Enables the monitor entry point to return OFFLINE several times before the resource is declared FAULTED
• Default: 0
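These are resource type attributes, so they are set with hatype and apply to every resource of that type. A sketch giving the Process type one restart attempt within a three-minute window and one tolerated monitor miss:

hatype -modify Process RestartLimit 1
hatype -modify Process ConfInterval 180
hatype -modify Process ToleranceLimit 1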

Restart Example

RestartLimit = 1: The resource is to be restarted one time within the ConfInterval time frame.
ConfInterval = 180: The resource can be restarted once within a three-minute interval.
MonitorInterval = 60 seconds (default value): The resource is monitored every 60 seconds.

[Timeline diagram: the resource goes offline and is restarted once within ConfInterval; a second offline within the same interval marks it faulted.]

Adjusting Monitoring

MonitorInterval
• The default value is 60 seconds for most resource types.
• Consider reducing it to 10 or 20 seconds for testing.
• Use caution when changing this value:
  – Load is increased on cluster systems.
  – Resources can fault if they cannot respond in the interval specified.

OfflineMonitorInterval
• The default is 300 seconds for most resource types.
• Consider reducing it to 60 seconds for testing.

Modifying Resource Type Attributes

• Can be used to optimize agents
• Applied to all resources of the specified type
• Command line example:
hatype -modify FileOnOff MonitorInterval 5

Preventing Failover

A frozen service group does not fail over when a critical resource faults.
The service group must be unfrozen to enable failover.

To freeze a service group:
hagrp -freeze service_group [-persistent]
To unfreeze a service group:
hagrp -unfreeze service_group [-persistent]

A persistent freeze:
• Requires the cluster configuration to be open
• Remains in effect even if VCS is stopped and restarted throughout the cluster

Clearing Faults

Verify that the faulted resource is offline.
Fix the problem that caused the fault and clean up any residual effects.

To clear a fault, type:
hares -clear resource_name [-sys system_name]
To clear all faults in a service group, type:
hagrp -clear group_name [-sys system_name]
Persistent resources are cleared by probing:
hares -probe resource_name [-sys system_name]

Probing Resources

Probing causes VCS to immediately monitor the resource.
To probe a resource, type:
hares -probe resource_name -sys system_name
You can clear a persistent resource by probing it after the underlying problem has been fixed.

Flushing Service Groups

• All online/offline agent processes are stopped.
• All resources in transitional states waiting to go online are taken offline.
• Propagation of the offline operation is stopped, but resources waiting to go offline remain in the transitional state.
• You must verify that the physical or software resources are stopped at the operating system level after flushing, to avoid creating a concurrency violation.

To flush a service group, type:
hagrp -flush group_name -sys system_name

Testing Failover

• Use test resources, such as FileOnOff, when applicable.
• Set lower values for MonitorInterval, OfflineMonitorInterval, and ConfInterval to detect faults more quickly.
• Manually online, offline, and switch the service group among all systems.
• Simulate failure of each resource in the service group.
• Simulate failover of the entire system.

Testing Examples

• Force a resource to fault.
• Reboot a system.
• Halt and reboot a system.
• Remove power from a system.

Summary

You should now be able to:
• Describe how VCS responds to faults.
• Implement failover policies.
• Set limits and prerequisites.
• Use system zones to control failover.
• Control failover behavior using attributes.
• Clear faults.
• Probe resources.
• Flush service groups.
• Test failover.

Lab 10: Faults and Failovers

[Lab diagram: Student Red's RedNFSSG and Student Blue's BlueNFSSG, each using the resfault, nofailover, and sysoffline triggers.]

VERITAS Cluster Server for Solaris
Lesson 11: Installing and Upgrading Applications in the Cluster

Overview

[Course roadmap: Introduction, Terms and Concepts, Installing VCS, Using Cluster Manager, Managing Cluster Services, Service Group Basics, Preparing Resources, Resources and Agents, NFS Resources, Event Notification, Faults and Failovers, Installing Applications, Using Volume Manager, Cluster Communication, Troubleshooting]

Objectives

After completing this lesson, you will be able to:
• Describe the benefits of keeping applications available during planned maintenance.
• Freeze service groups and systems.
• Upgrade a system in a running cluster.
• Describe the differences in application upgrades.
• Apply guidelines for installing new applications in the cluster.

Maintenance and Downtime

[Pie chart of downtime causes: Software 40%, Planned Downtime 30%, People 15%, Hardware 10%, Environment 5%, LAN/WAN Equipment <1%, Client <1%]

Operating System Update

[Diagram: web requests continue to be served by the remaining systems while a frozen web server receives an operating system update.]

Application Upgrade

[Diagram: the WebSG service group is frozen while the web application is updated; the DatabaseSG service group continues to run.]

Freezing a System

• Freezing a system prevents service groups from failing over to it.
• Failover can still occur from a frozen system.
• Freeze a system while maintenance is being performed.
• A persistent freeze remains in effect through VCS restarts.
• The -evacuate option moves service groups off the frozen system.

Syntax:
hasys -freeze [-persistent] [-evacuate] systemA
hasys -unfreeze [-persistent] systemA

Use hasys to determine whether a system is frozen:
hasys -display systemA -attribute Frozen
hasys -display systemA -attribute TFrozen

Freezing a Service Group

Freezing a service group prevents it from being taken offline, brought online, or failed over, even if a concurrency violation occurs.

Example update scenario:
1. Freeze the service group.
2. Update the application on the system(s) that are not currently running the application.
3. Unfreeze the service group.
4. Move the service group to an updated system and apply the application update on the original system.

A persistent freeze remains in effect even if VCS is stopped and restarted throughout the cluster.
Syntax:
hagrp -freeze service_group [-persistent]

Use hagrp to determine whether a group is frozen:
hagrp -display service_group -attribute Frozen
hagrp -display service_group -attribute TFrozen

Upgrading a System (Reboot Required)

[Flow chart summarized as steps:]
1. Move service groups to appropriate systems:
   hagrp -switch mySG -to systemA
2. Freeze and evacuate the system:
   hasys -freeze -persistent -evacuate systemA
3. Close the configuration:
   haconf -dump -makero
4. Stop VCS on the system:
   hastop -sys systemA
5. Perform the upgrade and reboot the system.
6. Open the configuration:
   haconf -makerw
7. Unfreeze the system:
   hasys -unfreeze -persistent systemA
8. If more systems need the upgrade, repeat from step 1; otherwise, done.

Differences in Application Upgrades

• Rolling upgrades
• No simple reversion from an upgrade
• Multiple installation directories
• Upgrading without rebooting

Installing Applications: Program Files on Shared Storage

Advantages:
• Simplifies application setup and maintenance
• The application service group is self-contained: all program and data files are located on file systems within the service group.

Disadvantages:
• Rolling upgrades cannot be performed.
• Downtime is increased during maintenance.

Binaries on Local Storage

Advantages:
• Minimizes downtime during application maintenance
• May be able to perform rolling upgrades (depending on the application)

Disadvantages:
• Must maintain multiple copies of the application
• Not scalable, due to maintenance overhead in clusters with large numbers of service groups and systems

Application Installation Guidelines

• Determine where to install program files (locally or on shared disk) based on your cluster environment.
• Install application data files on a shared storage partition that is accessible to each system that can run the application.
• Specify identical installation options.
• Use the same mount point when installing the application on each system.

Summary

You should now be able to:
• Describe the benefits of keeping applications available during planned maintenance.
• Freeze service groups and systems.
• Upgrade a system in a running cluster.
• Describe the differences in application upgrades.
• Apply guidelines for installing new applications in the cluster.

Lab 11: Installing Applications in the Cluster

[Lab diagram: Students Red and Blue install Volume Manager while the RedNFSSG and BlueNFSSG service groups continue to run.]

VERITAS Cluster Server for Solaris
Lesson 12: Volume Manager and Process Resources

Overview

[Course roadmap: Introduction, Terms and Concepts, Installing VCS, Using Cluster Manager, Managing Cluster Services, Service Group Basics, Preparing Resources, Resources and Agents, NFS Resources, Event Notification, Faults and Failovers, Installing Applications, Using Volume Manager, Cluster Communication, Troubleshooting]

Objectives

After completing this lesson, you will be able to:
• Describe how Volume Manager enhances high availability.
• Describe Volume Manager storage objects.
• Configure shared storage using Volume Manager.
• Create a service group with Volume Manager resources.
• Configure Process resources.
• Configure Application resources.

Volume Management

[Diagram: physical disks shared by System1 and System2 are presented to applications as virtual volumes.]

Volume Manager Objects

[Diagram: physical disks are placed under VxVM control as VxVM disks and divided into subdisks; subdisks are combined into plexes, and plexes form volumes. All of these objects are contained within a disk group.]

Disk Groups

[Diagram: physical disks Disk1, Disk2, and Disk3 are grouped as VxVM disks in disk group testDG.]

• VxVM objects cannot span disk groups.
• Disk groups represent management and configuration boundaries.
• Disk groups enable high availability.

VxVM Volume

[Diagram: volume Volume1 in disk group testDG is built from VxVM disks Disk1, Disk2, and Disk3.]

Volume Manager Configuration

1. Initialize the disk(s):
   vxdisksetup -i device
2. Create a disk group:
   vxdg init disk_group disk_name=device
3. Create a volume:
   vxassist -g disk_group make vol_name size
4. Make a file system:
   mkfs -F vxfs volume_device
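A worked sketch of the sequence, assuming a disk at c1t1d0 and the testDG/testVol names used in the lab (substitute your own device and names):

vxdisksetup -i c1t1d0                       # initialize the disk for VxVM
vxdg init testDG testDG01=c1t1d0            # create the disk group
vxassist -g testDG make testVol 1g          # create a 1 GB volume
mkfs -F vxfs /dev/vx/rdsk/testDG/testVol    # make a VxFS file system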

Testing Volume Manager Configuration

On the first system:
1. Create a mount point directory.
2. Mount the file system on the first system.
3. Verify that the file system is accessible.
4. Unmount the file system.
5. Deport the disk group.

On the next system(s):
1. Create a mount point directory with the same name.
2. Import the disk group.
3. Start the volume.
4. Mount and verify the file system.
5. Unmount the file system.
6. Deport the disk group.
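A command-level sketch of this test, assuming the testDG disk group and testVol volume created above, with mount point /test:

# On the first system:
mkdir /test
mount -F vxfs /dev/vx/dsk/testDG/testVol /test
ls /test                                    # verify access
umount /test
vxdg deport testDG

# On the next system:
mkdir /test
vxdg import testDG
vxvol -g testDG start testVol               # start the volume
mount -F vxfs /dev/vx/dsk/testDG/testVol /test
umount /test
vxdg deport testDG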

Volume Manager Resources

[Diagram: the VMSG service group contains a Proc resource that depends on a Mount resource, which depends on a VMVol (Volume) resource, which depends on a VMDG (DiskGroup) resource.]

DiskGroup Resource and Agent

Functions:
Online   Imports a Volume Manager disk group
Offline  Deports a disk group
Monitor  Determines the state of the disk group using vxdg

Required attributes:
DiskGroup  Name of the disk group

Optional attributes:
StartVolumes, StopVolumes

Configuration prerequisites:
The disk group and volume must be configured.
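No sample configuration appears on this slide; a minimal sketch using the lab's disk group name:

DiskGroup myDG (
    DiskGroup = testDG
)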

Volume Resource and Agent

Functions:
Online   Starts a volume
Offline  Stops a volume
Monitor  Reads a byte of data from the raw device interface for the volume

Required attributes:
DiskGroup  Name of the disk group
Volume     Name of the volume

Optional attributes: None

Configuration prerequisites:
The disk group and volume must be configured.
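A minimal sketch of a Volume resource, again using the lab names:

Volume myVol (
    DiskGroup = testDG
    Volume = testVol
)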

Configuring a Service Group

[Flow chart summarized as steps:]
1. Add the service group.
2. Set the SystemList attribute.
3. Set optional attributes.
4. Add and test each resource (see the resource flow chart); repeat while more resources remain.
5. Link the resources.
6. Set resources critical.
7. Test switching the service group.
8. Test failover. If successful, you are done; otherwise, check the logs, fix the problem, and retest.

Configuring a Resource

[Resource flow chart summarized as steps:]
1. Add the resource.
2. Set it non-critical.
3. Modify its attributes.
4. Enable the resource.
5. Bring it online. If it comes online, you are done.
6. If it is stuck waiting to go online, flush the group; if it faulted, clear the resource.
7. Check the log, disable the resource, fix the attributes, and try again.

Process Resource and Agent

Functions:
Online   Starts a daemon process
Offline  Stops a process
Monitor  Determines whether the process is running, using procfs

Required attributes:
PathName  Full path of the executable file

Optional attributes:
• Arguments
• Use % to escape dashed arguments:
hares -modify myProc Arguments %"-db -q1h"

Sample configuration:
Process sendmail (
    PathName = "/usr/lib/sendmail"
    Arguments = "-db -q1h"
)

The Application Resource and Agent

Functions:
Online   Brings an application online using StartProgram
Offline  Takes an application offline using StopProgram
Monitor  Monitors the status of the application in a number of ways
Clean    Takes the application offline using CleanProgram, or kills all the processes specified for the application

Required attributes:
StartProgram  Name of the executable that starts the application
StopProgram   Name of the executable that stops the application

One or more of the following:
MonitorProgram    Name of the executable that monitors the application
MonitorProcesses  List of processes to be monitored
PidFiles          List of pid files that contain the process IDs of the processes to be monitored

Optional attributes:
CleanProgram, User

Application Resource Configuration

Configuration prerequisites:
• The application should have its own start and stop programs.
• It should be possible to monitor the application, either by running a program that returns 0 for failure and 1 for success, or by checking a list of processes.

Sample configuration:
Application samba_app (
    StartProgram = "/usr/sbin/samba start"
    StopProgram = "/usr/sbin/samba stop"
    PidFiles = { "/var/lock/samba/smbd.pid" }
    MonitorProcesses = { "smbd" }
)

Summary

You should now be able to:
• Describe how Volume Manager enhances high availability.
• Describe Volume Manager storage objects.
• Configure shared storage using Volume Manager.
• Create a service group with Volume Manager resources.
• Configure Process resources.
• Configure Application resources.

Lab 12: Volume Manager and Process Resources

[Lab diagram: Student Red builds ProdSG with ProdDG, ProdVol, ProdMount, and ProdLoopy resources (disk group ProdDG, volume ProdVol, mount point /prod); Student Blue builds TestSG with TestDG, TestVol, TestMount, and TestLoopy resources (disk group TestDG, volume TestVol, mount point /test). RedNFSSG and BlueNFSSG continue to run.]

VERITAS Cluster Server for Solaris
Lesson 13: Cluster Communication

Overview

[Course roadmap: Introduction, Terms and Concepts, Installing VCS, Using Cluster Manager, Managing Cluster Services, Service Group Basics, Preparing Resources, Resources and Agents, NFS Resources, Event Notification, Faults and Failovers, Installing Applications, Using Volume Manager, Cluster Communication, Troubleshooting]

Objectives

After completing this lesson, you will be able to:
• Describe how systems communicate in a cluster.
• Describe the LLT and GAB configuration files and commands.
• Reconfigure LLT and GAB.
• Describe the effects of cluster communication failures.
• Recover from communication failures.
• Configure the InJeopardy trigger.
• Troubleshoot LLT and GAB.

Cluster Communication

[Diagram: on each of System A and System B, agents communicate with the had daemon through the agent framework; the had daemons exchange cluster state across the private network through GAB, which runs on top of LLT.]

GAB Membership Status

• Determines cluster membership using heartbeat signals
• Heartbeats are transmitted by LLT.
• Membership is determined by cluster ID number.

[Diagram: Systems A through D, each running GAB over LLT, form cluster 1.]

Cluster State

• GAB tracks all changes in configuration and resource status.
• GAB sends an atomic broadcast to immediately transmit the new configuration and status to all systems.

[Diagram: when a resource is added on one system, the change is broadcast so that every system holds the same cluster state.]

Low Latency Transport (LLT)

• Provides traffic distribution across all private links
• Sends and receives heartbeats
• Transmits cluster configuration data
• Determines whether connections are reliable (more than one exists) or unreliable
• Runs in the kernel for best performance
• Connection-oriented
• Uses DLPI over Ethernet
• Nonroutable

Configuring LLT

Required configuration files:
• /etc/llttab
• /etc/llthosts

Optional configuration file:
• /etc/VRTSvcs/conf/sysname

The llttab File

set-node train1
set-cluster 10
# Solaris example
link qfe0 /dev/qfe:0 - ether - -
link hme0 /dev/hme:0 - ether - -
start

Setting Node Number and Name

# /etc/llttab
set-cluster 10
set-node /etc/VRTSvcs/conf/sysname
link qfe0 /dev/qfe:0 - ether - -
link hme0 /dev/hme:0 - ether - -
link-lowpri qfe1 /dev/qfe:1 - ether - -
start

# /etc/llthosts
3 sysa
7 sysb

# /etc/VRTSvcs/conf/sysname
sysb

Valid ranges: cluster ID (set-cluster) 0-255; node number 0-31.

The link Directive

# /etc/llttab
set-node 1
set-cluster 10
# Solaris example
link qfe0 /dev/qfe:0 - ether - -
link hme0 /dev/hme:0 - ether - -
link-lowpri qfe1 /dev/qfe:1 - ether - -
start

Fields of the link directive: tag name, device:unit, node range ("-" means all), link type, SAP, and MTU.

Low Priority Link

• A public network link can serve as a redundant private network link.
• LLT sends only heartbeats on the low priority link if the other private network links are functional.
• The rate of heartbeats is slower, to reduce traffic.
• The low priority link is used for all cluster communication if all private links fail.
• The public network can become saturated with cluster traffic.
• There is a risk of system panics if the same system ID/cluster ID is present on the network.
• Configured with the link-lowpri directive

Other LLT Directives

# For verbose messages from lltconfig,
# add this line first in llttab
set-verbose 1

# The following causes only nodes 0-7
# to be valid for cluster participation
exclude 8-31

# peerinact specifies how long a link is
# down before it is marked inactive
set-timer peerinact:1600

# Regulates the heartbeat interval
set-timer heartbeat:50
set-timer heartbeatlo:100

start

The llthosts File

Format:
node_number name

Example entries:
1 systema
2 systemb
3 systemc

• No spaces before the number
• Same entries on all systems
• Unique node numbers required
• System names match llttab and main.cf
• System names match sysname, if used

The sysname File

• Enables llttab and llthosts to be identical on all systems
• Must be different on each system
• Contains the unique system name
• Removes the dependency on the UNIX node name
• The system name must be in llthosts.
• The system name must match main.cf.

GAB Configuration

GAB configuration file:
/etc/gabtab

GAB configuration command entry:
/sbin/gabconfig -c -n seed_number

The seed number is set to the number of systems in the cluster.
This starts GAB under normal conditions.
Other options are discussed later.
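For example, a two-system cluster's /etc/gabtab would contain a single line matching the syntax above:

/sbin/gabconfig -c -n 2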

Changing Communication Configuration

1. Stop VCS.
2. Stop GAB.
3. Stop LLT.
4. Edit the configuration files.
5. Start LLT.
6. Start GAB.
7. Start VCS.

Stopping GAB and LLT

Stop the VCS engine first.
Stop GAB on each system:
/sbin/gabconfig -U
Stop LLT:
/sbin/lltconfig -U

Starting LLT

Edit the configuration files on each system before starting LLT on any system.
Start LLT on each system in the cluster:
/sbin/lltconfig -c
LLT starts if the configuration files are correct.

Starting GAB

Start LLT before starting GAB.
Start GAB on each system, specifying a value for -n equal to the number of systems in the cluster:
/sbin/gabconfig -c -n #

Starting LLT and GAB Automatically

Startup files added when VCS is installed:/etc/rc2.d/S70llt

/etc/rc2.d/S92gab


The LinkHbStatus Attribute

An internal VCS system attribute that provides link status information.
Use the hasys command to view the status:
hasys -display system -attribute LinkHbStatus
hme:0 UP qfe:0 UP

The lltstat Command

train12# lltstat -nvv | pg
LLT node information:
Node          State   Link   Status   Address
* 0 train12   OPEN
                      link1  UP       08:00:20:AD:BC:78
                      link2  UP       08:00:20:AD:BC:79
                      link3  UP       08:00:20:B7:08:5C
  1 train11   OPEN
                      link1  UP       08:00:20:B4:0C:3B
                      link2  UP       08:00:20:B4:0C:3B
                      link3  UP       08:00:20:B4:0C:3B

The asterisk (*) marks the system on which the command was run.

Other lltstat Options

train12# lltstat -c
LLT configuration information:
    node: 20
    name: train3
    cluster: 10
    version: 1.1
    nodes: 20 - 21
    max nodes: 32
    max ports: 3
(…)

train12# lltstat -l
LLT link information:
Link Tag  State Type  Pri    SAP    MTU  Addrlen Xmit Recv
0    hme0 on   ether  hipri  0xCAFE 1500 6       3732 3678
1    qfe0 on   ether  hipri  0xCAFE 1500 6       3731 3674
2    qfe1 on   ether  lowpri 0xCAFE 1500 6       1584 6719

The lltconfig Command

train12# lltconfig -a list

Link 0 (qfe0):

Node 0 : 08:00:20:AD:BC:78 permanent

Node 1 : 08:00:20:AC:BE:76 permanent

Node 2 : 08:00:20:AD:BB:89 permanent

Link 1 (hme0):

Node 0 : 08:00:20:AD:BC:79 permanent

Node 1 : 08:00:20:AC:BE:77 permanent

Node 2 : 08:00:20:AD:BB:80 permanent


GAB Membership Notation

# /sbin/gabconfig -a
GAB Port Memberships
===============================================
Port a gen a36e003 membership 01 ;  ;12
Port h gen fd57002 membership 01 ;  ;12

Port a indicates that GAB is communicating; Port h indicates that had is communicating.
The membership string is positional: "01" means nodes 0 and 1 are members. The field after the semicolon placeholders covers the higher decades; here ";12" indicates that nodes 21 and 22 are also members (a 0 would be displayed in the tens position if node 10 were a member).

Communication Failures

Network partition:
Failure of all Ethernet heartbeat links between one or more systems:
• Occurs when one or more systems fail
• Also occurs when all Ethernet heartbeat links fail

Split brain:
• Failure of the Ethernet heartbeat links is misinterpreted as failure of one or more systems.
• Multiple systems start running the same failover application.
• Leads to data corruption if the applications use shared storage

Split-Brain Condition

[Diagram: two systems, each believing the other is down, change block 20460 on shared storage at the same time, leaving the block invalid.]

Preventing Split-Brain Condition

• Redundant heartbeat channels:
  – Multiple private network heartbeats
  – Public network heartbeat
  – Disk heartbeats
  – Service group heartbeat
• SCSI disk reservation
• Jeopardy
• Autodisabling
• Seeding
• PreOnline trigger

Jeopardy Condition

• A special type of cluster membership, called jeopardy, is formed when one or more systems have only a single Ethernet heartbeat link.
• Service groups continue to run, and the cluster functions normally.
• Failover and switching at operator request are unaffected.
• The service groups running on a system in jeopardy are not taken over by another system if a system failure is detected by VCS.

Jeopardy Example

[Diagram: systems A, B, and C run SG_1, SG_2, and SG_3; C has lost one of its two heartbeat links. Regular membership: A, B; jeopardy membership: C.]

Network Partition Example

[Diagram: system C loses both heartbeat links, creating a partition. SG_3 is autodisabled for systems A and B, and SG_1 and SG_2 are autodisabled for system C. A and B form the regular membership with no jeopardy membership; C forms a new regular membership of its own, also with no jeopardy membership.]

Split Brain Example

CBA

SG_1 SG_2 SG_3SG_3 SG_1 SG_2

Service Groups Not Autodisabled

Regular Membership: A, BNo Jeopardy Membership

New Regular MembershipNo Jeopardy Membership

Page 312: Veritas Cluster 2.0


Recovery Behavior

When a private network is reconnected after a network partition, VCS and GAB are stopped and restarted as follows:
Two-system cluster:
• The system with the lowest LLT node number continues to run VCS.
• VCS is stopped on the higher-numbered system.
Multi-system cluster:
• The mini-cluster with the most systems continues to run VCS; VCS is stopped on the systems in the smaller mini-cluster(s).
• If the cluster splits into two equal-size mini-clusters, the mini-cluster containing the lowest node number continues to run VCS.

Page 313: Veritas Cluster 2.0


Configuring Recovery Behavior

Modify /etc/gabtab. For example:
/sbin/gabconfig -c -n 2 -j
The -j option causes the high-numbered node to panic if GAB tries to start after all Ethernet connections simultaneously stop and then restart.
This is a split-brain avoidance mechanism.

Page 314: Veritas Cluster 2.0


Preexisting Network Partitions

This condition is caused by a failure in the private network communication channels while systems are down.
A preexisting network partition can lead to split brain when the systems are started.
VCS uses seeding to prevent a split-brain condition in the case of a preexisting network partition.

Page 315: Veritas Cluster 2.0


Seeding

Prevents split brain.
Only seeded systems can run VCS.
Systems are seeded only if GAB can communicate with other systems.
Seeding determines the number of systems that must be communicating to allow VCS to start.

Page 316: Veritas Cluster 2.0


Manually Seeding the Cluster

To start GAB and seed the system on which the command runs:
gabconfig -c -x
Warning: Do not use these options in gabtab.
The -x option overrides -n; it allows GAB to immediately seed the cluster so that VCS can build a running configuration.
Use it when the number of systems available is less than the number specified by -n in /etc/gabtab.
Use it on only one system in the cluster; the others then seed from the first system.
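A worked sketch, assuming a two-node cluster in which only train11 is available to boot (output abbreviated; the generation number is illustrative):

train11# gabconfig -a
GAB Port Memberships
===============================================
train11# gabconfig -c -x
train11# gabconfig -a
GAB Port Memberships
===============================================
Port a gen 24110001 membership 0

When train12 later starts GAB, it seeds from train11 and joins the port a membership.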

Page 317: Veritas Cluster 2.0


The InJeopardy Trigger

To configure, add an injeopardy script to /opt/VRTSvcs/bin/triggers (a sketch follows).
The trigger is called when a system transitions from regular cluster membership to jeopardy.
The arguments are the name of the system in jeopardy and the system state.
The trigger is invoked on all systems that are part of the jeopardy membership.
The InJeopardy trigger is not run when:
• A system loses its last network link.
• A system loses both private network links at once.
• A system transitions from any other state (such as the down state) to the jeopardy state.
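A minimal injeopardy trigger sketch; the log file and mail recipient are illustrative, and a real trigger should follow site notification policy:

train12# cat /opt/VRTSvcs/bin/triggers/injeopardy
#!/bin/sh
# injeopardy trigger -- invoked with:
#   $1 = name of the system in jeopardy
#   $2 = system state
SYSTEM=$1
STATE=$2
# Record the event locally.
echo "`date`: system $SYSTEM entered jeopardy (state: $STATE)" \
    >> /var/VRTSvcs/log/injeopardy.log
# Notify the administrator by e-mail.
echo "System $SYSTEM is in jeopardy; check its heartbeat links." | \
    mailx -s "VCS jeopardy: $SYSTEM" root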

Page 318: Veritas Cluster 2.0


The lltdump Command

train12# lltdump -f /dev/qfe:0 -V -A -R

DAT C 100 S 01 D 00 P 007 rdy 80000081 seq 000000b9 len 0132 ack 0000007c 01 01 64 05 00 00 00 01 00 07 89 00

DAT C 100 S 01 D 00 P 007 rdy 80000081 seq 000000bb len 0166 01 01 64 05 00 00 00 01 00 07 88 00

DAT C 100 S 01 D 00 P 007 rdy 80000081 seq 000000bc len 0166 ack 00000080 01 01 64 05 00 00 00 01 00 07 89 00

DAT C 100 S 01 D 00 P 007 rdy 80000081 seq 000000bf len 0176 ack 00000083 01 01 64 05 00 00 00 01 00 07 89 00

Page 319: Veritas Cluster 2.0


The lltshow Command

train12# lltshow -n 0 | pg

=== LLT node 0:

nid= 0 state= 4 OPEN my_gen= 3a89ec14 peer_gen= 0 flags= 0 links= 3
opens= ffffffff readyports= 0 rexmitcnt= 0 nxtlink= 0
lastacked= 0 nextseq= 0 recv_seq= 0
xmit_head= 0 xmit_tail= 0 xmit_next= 0
xmit_count= 0 recv_reseq= 0 oos= 0
retrans= 0 retrans2= 0
link [0]: hb= 0 hb2= 0 peerinact= 0 lasthb= 0
valid= 1 perm= 1 flags= 0 stat= 1
arpmode= 0
addr= 08 00 20 AD BC 78 00 00 00 00
dlpi_hdr= 00 00 00 07 00 00 00 08 00 00 00 14 00 00 00 64 00 00 00 00 08 00 20 AD BC 78 CA FE 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

[Slide callout: identifies LLT packets on the public network.]

Page 320: Veritas Cluster 2.0


Common LLT Problems

Node or cluster number out of range:
• The node number must be between 0 and 31.
• The cluster number must be between 0 and 255.
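For reference, these values are set by the set-node and set-cluster directives in /etc/llttab. A typical file (the node and cluster numbers are illustrative):

train12# cat /etc/llttab
set-node 0
set-cluster 10
link qfe0 /dev/qfe:0 - ether - -
link hme0 /dev/hme:0 - ether - -
start

Here set-node must be in the range 0-31 and set-cluster in the range 0-255.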

Page 321: Veritas Cluster 2.0


Incorrect LLT Specification

Incorrectly specified Ethernet link device:
qf3 should be qfe
LLT not started:
Check /etc/llttab for the start directive.

Page 322: Veritas Cluster 2.0


Common GAB Problems

No GAB membership:
• gabconfig -a
• gabconfig -c -n N
GAB starts, then shuts down:
• Check cabling.

Page 323: Veritas Cluster 2.0


Problems with main.cf

VCS does not start:
Check main.cf for incorrect entries.
hacf -verify aborts:
Check the system names in main.cf to verify that they match llthosts and llttab.

Page 324: Veritas Cluster 2.0


Summary

You should now be able to:
Describe how systems communicate in a cluster.
Configure the Low Latency Transport (LLT).
Configure the Group Membership and Atomic Broadcast (GAB) mechanism.
Start and stop LLT and GAB.
Configure the InJeopardy trigger.
Troubleshoot LLT and GAB.

Page 325: Veritas Cluster 2.0


Lab 13: Cluster Communication

[Lab diagram: service groups RedNFSSG (ProdDG, ProdVol, ProdMount, ProdLoopy), BlueNFSSG, and TestSG (TestDG, TestVol, TestMount, TestLoopy), plus the injeopardy trigger, divided between Student Red and Student Blue.]

Page 326: Veritas Cluster 2.0

VERITAS Cluster Server for Solaris

Lesson 14
Troubleshooting

Page 327: Veritas Cluster 2.0


Overview

[Course map: Introduction; Terms and Concepts; Service Group Basics; Using Cluster Manager; Installing VCS; Preparing Resources; Resources and Agents; NFS Resources; Managing Cluster Services; Event Notification; Faults and Failovers; Using Volume Manager; Installing Applications; Cluster Communication; Troubleshooting (this lesson)]

Page 328: Veritas Cluster 2.0


Objectives

After completing this lesson, you will be able to:
Monitor system and cluster status.
Apply troubleshooting techniques in a VCS environment.
Detect and solve VCS communication problems.
Identify and solve VCS engine problems.
Correct service group problems.
Solve problems with agents.
Resolve problems with resources.
Plan for disaster recovery.

Page 329: Veritas Cluster 2.0


Monitoring VCS

VCS log files
System log files
The hastatus utility
SNMP traps
Event notification triggers
Cluster Manager

Page 330: Veritas Cluster 2.0


VCS Log Entries

Engine log: /var/VRTSvcs/log/engine_A.log

TAG_D 2001/04/03 12:17:44 VCS:11022:VCS engine (had) started

TAG_D 2001/04/03 12:17:44 VCS:10114:opening GAB library

TAG_C 2001/04/03 12:17:45 VCS:10526:IpmHandle::recv peer exited errno 10054

TAG_E 2001/04/03 12:17:52 VCS:10077:received new cluster membership

TAG_E 2001/04/03 12:17:52 VCS:10080:Membership: 0x3, Jeopardy: 0x0

TAG_D 2001/04/03 12:17:52 VCS:10322:Node '1' changed state from 'UNKNOWN' to 'INITING'

TAG_B 2001/04/03 12:17:52 VCS:10455:Operation 'haclus -modify(0xc13)' rejected. Sysstate=CURRENT_DISCOVER_WAIT,Channel=BCAST,Flags=0x40000

(The most recent entries appear at the end of the log.)

Page 331: Veritas Cluster 2.0


Agent Log Entries

Agent logs are kept in /var/VRTSvcs/log.
Log files are named AgentName_A.log.
LogLevel attribute settings:
• none
• error (default setting)
• info
• debug
• all
To change the log level:
hatype -modify res_type LogLevel debug
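For example, to check and then raise the log level for all Mount resources (the resource type is illustrative; output abbreviated):

train12# hatype -display Mount | grep LogLevel
Mount LogLevel error
train12# hatype -modify Mount LogLevel debug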

Page 332: Veritas Cluster 2.0


Troubleshooting Guide

Primary types of problems:
• Cluster communication
• VCS engine startup
• Service groups and resources
Determine the troubleshooting path based on hastatus output:
• A cluster communication problem is indicated by the message:
  Cannot connect to server -- Retry Later
• A VCS engine startup problem is indicated by systems with a WAIT status.
• Service group and resource problems are indicated when the VCS engine is in the RUNNING state.

Page 333: Veritas Cluster 2.0


Cluster Communication Problems

Run gabconfig -a.
No port a membership indicates a communication problem.
No port h membership indicates a VCS engine (had) startup problem.

Communication problem (GAB not seeded):
# gabconfig -a
GAB Port Memberships
===================================

VCS engine not running (GAB and LLT functioning):
# gabconfig -a
GAB Port Memberships
===================================
Port a gen 24110002 membership 01
Port h gen 65510002 membership

Page 334: Veritas Cluster 2.0


Problems with GAB and LLT

If GAB is not seeded (no port memberships):
• Run lltconfig to determine if LLT is running.
• Run lltstat -n to determine if the systems can see each other on the LLT links.
• Check the physical network connection(s) if LLT cannot see each node.
• Check gabtab for the correct seed value (-n) if the LLT links are functional.
Manually seed the cluster, if necessary.

lltconfig
LLT is running

lltstat -n
LLT node information:
Node         State  Links
* 0 train11  OPEN   2
  1 train12  OPEN   2

Page 335: Veritas Cluster 2.0


VCS Engine Startup Problems

Start the VCS engine using hastart.
Check hastatus to determine the system state.
If VCS is not running:
• If the state is ADMIN_WAIT or STALE_ADMIN_WAIT, see the next sections.
• Check the logs.
• Verify that the llthosts file exists and that the system entries match the cluster configuration (main.cf).
• Check gabconfig.

Page 336: Veritas Cluster 2.0


STALE_ADMIN_WAIT

To recover from the STALE_ADMIN_WAIT state:
1. Visually inspect the main.cf file to determine whether it is valid.
2. Edit the main.cf file, if necessary.
3. Verify the syntax of main.cf, if modified:
   hacf -verify config_dir
4. Start VCS on the system with the valid main.cf file:
   hasys -force system_name
5. All other systems perform a remote build from the system now running.
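A worked sketch of steps 3 and 4, assuming the valid main.cf is on train11:

train11# hacf -verify /etc/VRTSvcs/conf/config
train11# hasys -force train11

Once train11 is running, the remaining systems perform a remote build from it.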

Page 337: Veritas Cluster 2.0


ADMIN_WAIT

A system can be in the ADMIN_WAIT state under these circumstances:
• A .stale flag exists and the main.cf file has a syntax problem.
• A disk error affecting main.cf occurs during a local build.
• The system is performing a remote build and the last running system fails.
Restore main.cf and use the procedure for STALE_ADMIN_WAIT.

Page 338: Veritas Cluster 2.0


Service Group Not Configured to AutoStart or Run

Service group not brought online automatically when VCS starts:
Check the AutoStart and AutoStartList attributes:
hagrp -display service_group
Service group not configured to run on the system:
• Check the SystemList attribute.
• Verify that the system name is included.
An example of these checks follows.
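For example, for a hypothetical service group websg (output abbreviated and illustrative):

train12# hagrp -display websg | grep -i auto
websg  AutoStart      global  1
websg  AutoStartList  global  train11 train12
train12# hagrp -display websg | grep SystemList
websg  SystemList     global  train11 1 train12 2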

Page 339: Veritas Cluster 2.0


Service Group AutoDisabled

Autodisable occurs when:
• GAB sees a system but had is not running on that system.
• Resources of the service group are not fully probed on all systems in the SystemList.
• A particular system is visible through disk heartbeat only.
Make sure that the service group is offline on all systems in the SystemList attribute.
Clear the AutoDisabled attribute:
hagrp -autoenable service_group -sys system
Bring the service group online.
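For example, after verifying that a hypothetical group testsg is offline on all systems in its SystemList:

train12# hagrp -autoenable testsg -sys train12
train12# hagrp -online testsg -sys train12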

Page 340: Veritas Cluster 2.0


Service Group Waiting for Dependencies

Check service group dependencies:
hagrp -dep service_group
Check resource dependencies:
hares -dep resource

Page 341: Veritas Cluster 2.0


Service Group Not Fully Probed

Usually a result of misconfigured resource attributes.
Check the ProbesPending attribute:
hagrp -display service_group
Check which resources are not probed:
hastatus -sum
Check the Probes attribute for resources:
hares -display
To probe resources:
hares -probe resource -sys system

Page 342: Veritas Cluster 2.0


Service Group Frozen

Verify the value of the Frozen and TFrozen attributes:
hagrp -display service_group
Unfreeze the service group:
hagrp -unfreeze group [-persistent]
If you freeze persistently, you must unfreeze persistently.
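For example, for a hypothetical group testsg that was frozen persistently (output abbreviated). A persistent unfreeze changes main.cf, so the configuration must first be opened read/write:

train12# hagrp -display testsg | grep -i frozen
testsg  Frozen   global  1
testsg  TFrozen  global  0
train12# haconf -makerw
train12# hagrp -unfreeze testsg -persistent
train12# haconf -dump -makero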

Page 343: Veritas Cluster 2.0


Service Group Is Not Offline Elsewhere

Determine which resources are online/offline:
hastatus -sum
Verify the State attribute:
hagrp -display service_group
Take the group offline on the other system:
hagrp -offline service_group -sys system
Flush the service group:
hagrp -flush service_group -sys system

Page 344: Veritas Cluster 2.0


Service Group Waiting for Resource

Review the IState attribute of all resources to determine which resource is waiting to go online.
Use hastatus to identify the resource.
Make sure the resource is offline (at the operating system level).
Clear the internal state of the service group:
hagrp -flush service_group -sys system
Take the other resources in the service group offline and try to bring the resources online on another system.
Verify that the resource works properly outside VCS.
Check for errors in attribute values.

Page 345: Veritas Cluster 2.0


Incorrect Local Name

1. Create /etc/VRTSvcs/conf/sysname with the correct system name shown in main.cf.
2. Stop VCS on the local system.
3. Start VCS.
4. List all system names.
5. Open the configuration.
6. Delete any systems with incorrect names.
7. Save the configuration.

Page 346: Veritas Cluster 2.0


Concurrency Violations

A concurrency violation occurs when a failover service group is online or partially online on more than one system.
Notification is provided by the Violation trigger:
• Invoked on the system that caused the concurrency violation
• Notifies the administrator and takes the service group offline on the system causing the violation
• Configured by default with the violation script in /opt/VRTSvcs/bin/triggers
• Can be customized (see the sketch below) to:
  – Send a message to the system log.
  – Display a warning on all cluster systems.
  – Send e-mail messages.
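A sketch of a customized violation trigger, assuming (as with the default script) that the arguments are the offending system and the service group name:

train12# cat /opt/VRTSvcs/bin/triggers/violation
#!/bin/sh
# violation trigger -- invoked with:
#   $1 = system on which the concurrency violation occurred
#   $2 = service group that is in violation
SYSTEM=$1
GROUP=$2
# Send a message to the system log.
logger -p daemon.err "VCS: concurrency violation: group $GROUP on $SYSTEM"
# Take the service group offline on the offending system.
/opt/VRTSvcs/bin/hagrp -offline $GROUP -sys $SYSTEM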

Page 347: Veritas Cluster 2.0


Service Group Waiting for Resource to Go Offline

Identify which resource is not offline:
hastatus -summary
Check logs.
Manually take the resource offline, if necessary.
Configure the ResNotOff trigger for notification or action.

Page 348: Veritas Cluster 2.0


Agent Not Running

Determine whether the agent for that resource is FAULTED:
hastatus -summary
Use the ps command to verify that the agent process is not running.
Verify values for the ArgList and ArgListValues attributes:
hatype -display res_type
Restart the agent:
haagent -start res_type -sys system

Page 349: Veritas Cluster 2.0


Problems Bringing Resources Online

Possible causes of failure while bringing resources online:
Waiting for child resources
Stuck in a WAIT state
Agent not running

Page 350: Veritas Cluster 2.0


Problems Taking Resources Offline

Waiting for parent resources to go offline
Waiting for a resource to respond
Agent not running

Page 351: Veritas Cluster 2.0


Critical Resource Faults

Determine which critical resource has faulted:
hastatus -summary
Make sure that the resource is offline.
Examine the engine log.
Fix the problem.
Verify that the resources work properly outside of VCS.
Clear the fault in VCS.

Page 352: Veritas Cluster 2.0


Clearing Faults

After external problems are fixed:
1. Clear any faults on nonpersistent resources:
   hares -clear resource -sys system
2. Check attribute fields for incorrect or missing data.
If the service group is partially online:
1. Flush wait states:
   hagrp -flush service_group -sys system
2. Take resources offline before bringing them online.
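A worked sequence, assuming a faulted nonpersistent resource nfs_ip in a partially online group nfssg (the names are illustrative):

train12# hares -clear nfs_ip -sys train12
train12# hagrp -flush nfssg -sys train12
train12# hagrp -offline nfssg -sys train12
train12# hagrp -online nfssg -sys train12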

Page 353: Veritas Cluster 2.0


Planning for Disaster Recovery

Back up key VCS files:
• types.cf and customized types files
• main.cf
• main.cmd
• sysname
• LLT and GAB configuration files
• Customized trigger scripts
• Customized agents
Use hagetcf to create an archive.

Page 354: Veritas Cluster 2.0


The hagetcf Utility

# hagetcf

Saving 0.13 MB

Enter path where configuration can be saved (default is /tmp):

Collecting package info

Checking VCS package integrity

Collecting VCS information

Collecting system configuration

…..

Compressing /tmp/vcsconf.train12.tar to /tmp/vcsconf.train12.tar.gz

Done. Please e-mail /tmp/vcsconf.train12.tar.gz to your support provider.

Page 355: Veritas Cluster 2.0


Summary

You should now be able to:
Monitor system and cluster status.
Apply troubleshooting techniques in a VCS environment.
Identify and solve VCS engine problems.
Correct service group problems.
Solve problems with agents.
Resolve problems with resources.
Plan for disaster recovery.

Page 356: Veritas Cluster 2.0

Lab Exercise

Lesson 14
Troubleshooting

Page 357: Veritas Cluster 2.0

VERITAS Cluster Server for Solaris

Appendix D
Special Situations

Page 358: Veritas Cluster 2.0


Overview

This lesson provides a guide for managing certain situations in a cluster environment:

VCS upgrades
VCS patches
System changes: adding, removing, and replacing cluster systems

Page 359: Veritas Cluster 2.0


Objectives

After completing this lesson, you will be able to:
Upgrade VCS software to version 2.0 from any earlier version.
Install a VCS patch.
Add systems to a running VCS cluster.
Remove systems from a running VCS cluster.
Replace systems in a running VCS cluster.

Page 360: Veritas Cluster 2.0


Preparations for VCS Upgrade

Acquire the new VCS software.
Contact VERITAS Technical Support.
Read the release notes.
Write scripts to automate as much of the process as possible.
If available, deploy on a test cluster first.

Page 361: Veritas Cluster 2.0


VCS Upgrade Process

Start
I. Complete initial preparation.
II. Stop the existing VCS software.
III. Remove the existing VCS software and add the new VCS version.
IV. Verify the configuration and make changes as needed.
V. Start VCS on one system and propagate the configuration to the others.
Done

Page 362: Veritas Cluster 2.0


Step I - Initial Preparation

1. Open the cluster configuration and freeze all service groups persistently:
   haconf -makerw
   hagrp -list
   hagrp -freeze group_name -persistent
2. Save and close the VCS configuration:
   haconf -dump -makero
3. Make a backup of the full configuration, including:
   • All configuration files
   • Any custom-developed agents
   • Any modified VCS scripts
4. Rename the existing types.cf file:
   mv /etc/VRTSvcs/conf/config/types.cf \
      /etc/VRTSvcs/conf/config/types.save

Page 363: Veritas Cluster 2.0


Step II - Stopping the VCS Software

1. Stop the VCS engine on all systems, leaving the application services running:
   hastop -all -force
2. Remove heartbeat disk configurations:
   gabdiskhb -l
   gabdiskx -l
   gabdiskhb -d disk_name
   gabdiskx -d device_name
3. Stop and unload GAB:
   gabconfig -U
   modinfo | grep gab
   modunload -i modid
4. Stop and unload LLT:
   lltconfig -U
   modinfo | grep llt
   modunload -i modid

Page 364: Veritas Cluster 2.0


Step III - Removing Old and Adding New VCS Software

1. Remove the existing VCS (pre-2.0) software packages:
   pkgrm VRTScscm VRTSvcs VRTSgab VRTSllt \
       VRTSperl
2. Add the new VCS software packages:
   pkgadd -d /package_directory

Page 365: Veritas Cluster 2.0


Step IV - Verifying and Changing the Configuration

1. Determine the differences between the existing and new types.cf files:
   diff /etc/VRTSvcs/conf/config/types.save \
       /etc/VRTSvcs/conf/config/types.cf
2. Merge the new and old versions of the types.cf files:
   a. Check changes in attribute names.
   b. Check modified resource type attributes.
3. Compare and merge any necessary changes to VCS scripts.
4. Verify the configuration files:
   hacf -verify /etc/VRTSvcs/conf/config

Page 366: Veritas Cluster 2.0


Step V - Starting the VCS Cluster

1. On all systems in the cluster, start LLT and GAB:
   lltconfig -c
   gabconfig -c -n #
2. Start the VCS engine on the system where the changes were made:
   hastart
3. Start the VCS engine on all other systems in the cluster in a stale state:
   hastart -stale
4. Open the configuration, unfreeze the service groups, and save and close the configuration:
   haconf -makerw
   hagrp -unfreeze group_name -persistent
   haconf -dump -makero

Page 367: Veritas Cluster 2.0


Installing a VCS Patch

Start
I. Carry out the initial preparation (same as in the VCS upgrade).
II. Stop the old VCS software (same as in the VCS upgrade).
III. Install and verify the new patch.
IV. Start the VCS software.
Done

Page 368: Veritas Cluster 2.0


Step III - Installing and Verifying the New Patch

1. Verify that the VRTS* packages are all version 2.0:
   pkginfo -l VRTSgab VRTSllt VRTSvcs \
       VRTSperl | grep VERSION
2. Add the new VCS patch on each system using the provided utility:
   ./vcs_install_patch
3. Verify that the new patch has been installed:
   showrev -p | grep VRTS

Page 369: Veritas Cluster 2.0


Step IV - Starting the VCS Cluster

1. Start LLT, GAB, and VCS on all systems in the cluster:
   lltconfig -c
   gabconfig -c -n #
   hastart
2. Open the configuration, unfreeze the service groups, and save and close the configuration:
   haconf -makerw
   hagrp -unfreeze group_name -persistent
   haconf -dump -makero

Page 370: Veritas Cluster 2.0


Adding Systems to a Running VCS Cluster

1. Configure LLT with the same cluster number and a unique node ID on the new system.
2. Configure GAB.
3. Connect the new system to the private network.
4. Edit the /etc/llthosts files on all systems in the cluster to add the system name and node ID of the new system.
5. Start LLT, GAB, and VCS on the new system.
6. Change the SystemList attribute for each service group that can run on the new system.
A sketch of these files and commands follows.
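A sketch of these steps for a hypothetical third system, train13, joining as node 2:

On train13, /etc/llttab might contain:
set-node 2
set-cluster 10
link qfe0 /dev/qfe:0 - ether - -
link hme0 /dev/hme:0 - ether - -
start

On every system, /etc/llthosts gains the new entry:
0 train11
1 train12
2 train13

After LLT, GAB, and VCS are started on train13, add it to each eligible service group (the group name websg and priority 2 are illustrative):
haconf -makerw
hagrp -modify websg SystemList -add train13 2
haconf -dump -makero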

Page 371: Veritas Cluster 2.0


Removing Systems from a Running VCS Cluster

1. Switch all running service groups to other systems and freeze the system.
2. Stop VCS on the system using hastop -local.
3. Stop and unload GAB on the system:
   gabconfig -U
   modinfo | grep gab
   modunload -i modid
4. Stop and unload LLT on the system:
   lltconfig -U
   modinfo | grep llt
   modunload -i modid
5. Remove the system from the cluster configuration:
   hasys -delete system_name
6. Edit /etc/llthosts on all systems to delete the entry for the removed system.
7. Remove the llttab and gabtab files on that system.

Page 372: Veritas Cluster 2.0


Replacing Systems in a Running VCS Cluster

1. Evacuate any service groups running on the system to be replaced.
2. Make the VCS configuration read/write, freeze the system persistently, and save and close the configuration:
   haconf -makerw
   hasys -freeze system_name -persistent
   haconf -dump -makero
3. Physically replace the system with a new one using the same VCS configuration (same cluster number, node ID, and system name).
4. Connect the new system to the private network.
5. Start LLT, GAB, and VCS on the new system.
6. Make the VCS configuration read/write, unfreeze the system, and save and close the configuration:
   haconf -makerw
   hasys -unfreeze system_name -persistent
   haconf -dump -makero

Page 373: Veritas Cluster 2.0


Summary

You should now be able to:
Upgrade VCS software to version 2.0.
Install a VCS patch.
Add systems to a running VCS cluster.
Remove systems from a running VCS cluster.
Replace systems in a running VCS cluster.

Page 374: Veritas Cluster 2.0


Lab: Installing VCS Patches

[Lab diagram: Student Red (service group RedSG) and Student Blue (service group BlueSG) each install the patch.]

Page 375: Veritas Cluster 2.0

VERITAS Cluster Server for Solaris

Introduction

Page 376: Veritas Cluster 2.0


VERITAS Cluster Server

[Diagram: clients on a public network access applications/services (NFS, WWW, FTP, DB) running on cluster systems that are connected by the VCS private network and attached to shared storage.]

Page 377: Veritas Cluster 2.0


VCS Features

[Diagram: clustered databases and clustered Web servers on a network.]
Availability:
• Monitor and restart applications
• Set failover policies
Scalability:
• Distribute services
• Add systems and storage to running clusters
Manageability:
• Use Java or Web graphical interfaces
• Manage multiple clusters

Page 378: Veritas Cluster 2.0


High Availability Design

HA-aware applications:
• Restart capability
• Crash tolerance
HA management software:
• Site replication
• Fault detection, notification, and failover
• Storage management
• Backup and recovery
Redundant hardware:
• Power supplies
• Network interface cards, hubs, switches
• Storage

Page 379: Veritas Cluster 2.0


VERITAS Clustering and Replication Products

Foundation Products: VERITAS Volume Manager and File System
Parallel Extensions: VERITAS Cluster Volume Manager and File System
Data Replication: VERITAS VVR and support for array-based replication
Cluster Management: VERITAS Global Cluster Manager
Application Availability Agents: Informix, Oracle, Sybase, Apache
High Availability Clustering: VERITAS Cluster Server

Page 380: Veritas Cluster 2.0


VERITAS High Availability Solutions

[Diagram: two sites, Tokyo and London, each running a VCS cluster on VxVM and VxFS; database and Web data are replicated across the WAN by Volume Replicator, and both clusters are managed through Global Cluster Manager.]

Page 381: Veritas Cluster 2.0


References for High Availability

Blueprints for High Availability: Designing Resilient Distributed Systems, by Evan Marcus and Hal Stern
High Availability Design, Techniques, and Processes, by Floyd Piedad and Michael Hawkins
Designing Storage Area Networks, by Tom Clark
Storage Area Network Essentials: A Complete Guide to Understanding and Implementing SANs, by Richard Barker and Paul Massiglia
VERITAS High Availability Fundamentals (Web-based training)

Page 382: Veritas Cluster 2.0


Course Overview

[Course map: Introduction; Terms and Concepts; Service Group Basics; Using Cluster Manager; Installing VCS; Preparing Resources; Resources and Agents; NFS Resources; Managing Cluster Services; Event Notification; Faults and Failovers; Using Volume Manager; Installing Applications; Cluster Communication; Troubleshooting]

Page 383: Veritas Cluster 2.0


Lab Overview

[Lab diagram: two systems share a public network, a private network, and a SCSI JBOD. The Red student uses train1, the odd/low-numbered system; the Blue student uses train2, the even/high-numbered system.]


Recommended