2
Why Oracle Cluster Ready Services? - Support for Linux
Appeared initially to support Oracle Parallel Server 8.1.7 on Linux
Looked like an exotic configuration at the time
Benefit: one can install Parallel Server on Linux
3
Why Oracle Cluster Ready Services? - Meant to Support RAC
The lower prices and higher speed of communication equipment gave Oracle's “share everything” architecture a huge advantage: it started to scale well in 9i.
Customers were still afraid of the complicated setup (vendor-specific clusterware, raw devices to share the storage) and the high price (RAC was an option of Oracle EE)
Oracle's answer:
- Oracle Cluster Ready Services
- OCFS and ASM to share the storage
- RAC as a part of Oracle Server SE, included in the price
The results:
- Tens of installations all around Bulgaria
- Thousands of installations all around the world
- RAC becomes a commodity
4
Why Oracle Cluster Ready Services? - Generic Code
Generic code means generic bugs
Generic bugs are easier and faster to find: whatever platform you run on, you can hit them, which means more testers
Generic bugs are easier to fix – one fix for all platforms
Generic code is cheaper to support – only one team vs. many platform specific teams
6
Why Oracle Cluster Ready Services? - Single Support Resource
No more ping-pong between the hardware (and clusterware) vendor and Oracle
No more need for different experts to configure different parts
It all comes from Oracle
7
What is Oracle Clusterware?
Enables one system to be composed of many machines
Enables one Service to be provided by many nodes
Enables processes to be failed over to a surviving node in case of failure
Enables network interfaces to be failed over to a surviving node
Monitors all the resources and relocates them as needed
Notifies the cluster members, client applications and all the subscribers of resource status changes
Creates a base for cluster-enabled applications (such as RAC)
Enables cluster level resource startup/shutdown
8
Oracle Clusterware Hardware Concepts
One or more (generally 2 or more) servers
Inter-node communication media (most often a high-speed network)
Public network interface
Shared storage resources
9
Oracle Clusterware Software Concepts - The Oracle Cluster Registry (OCR)
Contains the cluster configuration (the SYSTEM section)
Contains the Oracle Database and Services resource definitions (the DATABASE section)
Contains the third-party resource definitions (the CRS section)
The ocrdump utility dumps the OCR in text or XML format and lets us browse its structure and contents
10
Oracle Clusterware Software Concepts - The Voting Disk
Why a voting disk is needed: if the node interconnect (IC) fails, a node cannot tell whether the other node is down or only the IC is down. Each node can therefore decide that the other is down and try to recover the cluster. The cluster would split into sub-clusters (“split brain”).
The voting disk is a file on the shared storage, shared between the nodes, where each node writes its “heartbeat”
Provides a second communication path between the nodes, to determine which one should go down and which should stay and recover
Should be mirrored (at the Oracle or OS level) to prevent corruption
If the voting disk is inaccessible, the cluster goes down
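The heartbeat idea can be illustrated with a toy model. This is only a sketch: the "node timestamp" record layout, the function name stale_nodes and the timeout value are assumptions made for the illustration, not the real voting disk format or Oracle's eviction algorithm. It reports the nodes whose most recent heartbeat is older than the timeout.

```shell
#!/bin/sh
# Toy model only: each input line is "<node> <epoch-seconds>", one heartbeat record.
# Prints the nodes whose newest heartbeat is more than TIMEOUT seconds older than NOW.
# usage: stale_nodes NOW TIMEOUT < heartbeat_records
stale_nodes() {
    awk -v now="$1" -v timeout="$2" '
        # remember the newest heartbeat seen for every node
        !($1 in last) || $2 > last[$1] { last[$1] = $2 }
        END {
            for (n in last)
                if (now - last[n] > timeout) print n
        }
    ' | sort
}

# Hypothetical records: class01 last wrote at t=130, class02 at t=105
stale_nodes 135 20 <<'EOF'
class01 100
class02 100
class01 130
class02 105
EOF
```

With the sample records only class02 is reported, because its last heartbeat is 30 seconds old while class01 wrote 5 seconds ago.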
11
Oracle Clusterware Processes on Linux and UNIX Systems
crsd – Performs high availability recovery and management operations such as maintaining the OCR and managing application resources. On Linux and UNIX this process runs as the root user. It restarts automatically upon failure.
evmd – Event manager daemon. This process also starts the racgevt process to manage FAN server callouts.
ocssd – Manages cluster node membership and runs as the oracle user. Uses the interconnect (IC) and the voting disk; failure of this process results in a node restart.
12
Oracle Clusterware Processes on Linux and UNIX Systems
oprocd – Process monitor for the cluster. Note that this process only appears on platforms that do not use third-party vendor clusterware with Oracle Clusterware.
13
Oracle Clusterware Processes on Linux - Process startup
From the Linux man pages:
DESCRIPTION
The inittab file describes which processes are started at bootup and during normal operation
......
An entry in the inittab file has the following format:
id:runlevels:action:process
.......
Valid actions for the action field are:
respawn
The process will be restarted whenever it terminates (e.g. getty)
......
14
Oracle Clusterware Processes on Linux - Process startup
[oracle@class01 bin]$ cat /etc/inittab
......
# Run xdm in runlevel 5
x:5:respawn:/etc/X11/prefdm -nodaemon
h1:35:respawn:/etc/init.d/init.evmd run >/dev/null 2>&1 </dev/null
h2:35:respawn:/etc/init.d/init.cssd fatal >/dev/null 2>&1 </dev/null
h3:35:respawn:/etc/init.d/init.crsd run >/dev/null 2>&1 </dev/null
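To see which entries init will keep restarting, the id:runlevels:action:process format can be split on the colons. The following is a minimal sketch; the function name respawn_entries is made up for the example, and the sample lines are the ones from the listing above.

```shell
#!/bin/sh
# Print "id runlevels process" for every inittab entry whose action is respawn.
# Reads inittab-formatted text on stdin; comment lines are skipped.
respawn_entries() {
    awk -F: '
        !/^[ \t]*#/ && $3 == "respawn" {
            proc = $4
            # the process field may itself contain colons, so glue the rest back on
            for (i = 5; i <= NF; i++) proc = proc ":" $i
            print $1, $2, proc
        }
    '
}

respawn_entries <<'EOF'
# Run xdm in runlevel 5
x:5:respawn:/etc/X11/prefdm -nodaemon
h1:35:respawn:/etc/init.d/init.evmd run >/dev/null 2>&1 </dev/null
h2:35:respawn:/etc/init.d/init.cssd fatal >/dev/null 2>&1 </dev/null
h3:35:respawn:/etc/init.d/init.crsd run >/dev/null 2>&1 </dev/null
EOF
```

On a live node the same function can be fed with respawn_entries < /etc/inittab. The h1, h2 and h3 entries are the ones that bring evmd, cssd and crsd back in runlevels 3 and 5.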
15
Oracle Clusterware Processes startup on Windows
Oracle Process Manager Daemon (OPMD) – OPMD is registered with the Windows Service Control Manager (WSCM), and the startup of all Oracle Clusterware services depends on OPMD. On system startup, and after the default time period of 60 seconds has elapsed, OPMD automatically starts all of the registered Oracle Clusterware services. This startup delay enables other services to start that are outside of the scope of Oracle control, such as storage access, anti-virus, or firewall services. You can set OPMD to start manually. However, this will delay the startup of the rest of the affected Oracle Clusterware services.
16
The RACG Infrastructure
Takes care of the Oracle-specific resources
One racgimon process is spawned for each database or ASM instance to monitor its health
[oracle@class01 ~]$ ps -ef|grep racg
oracle 5822 1 0 11:31 ? 00:00:04 /u01/app/oracle/product/11.1/db_1/bin/racgimon startd racdb
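A quick way to list what racgimon is watching is to pull the argument that follows "startd" out of the ps listing. This is a sketch that assumes the command line always has the form shown above (racgimon startd <name>); the function name racgimon_targets is invented for the example.

```shell
#!/bin/sh
# Read `ps -ef` output on stdin and print the name that each racgimon monitors.
# Assumes the command line looks like ".../racgimon startd <name>".
racgimon_targets() {
    awk '{
        for (i = 1; i < NF - 1; i++)
            if ($i ~ /(^|\/)racgimon$/ && $(i + 1) == "startd")
                print $(i + 2)
    }'
}

racgimon_targets <<'EOF'
oracle 5822 1 0 11:31 ? 00:00:04 /u01/app/oracle/product/11.1/db_1/bin/racgimon startd racdb
EOF
```

On a cluster node it would be used as ps -ef | racgimon_targets.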
17
The RACG Infrastructure
CRSD also spawns other child processes to perform different actions (kill, start/stop resources, change configurations etc.)
racgeut – kills actions that have timed out
Usage: racgeut [-e ...=...] <timeout> <prog_exe> <param_list>
racgmain – starts/stops/checks/manages resources
Usage: racgmain [resource name] start|stop|check
racgmain startorp|failsrvsa dbname instname [srvname]
racgmain startorp|failsrvsa nodename
racgmain cond_resname cond_state func [args...]
racgvip (run as root) – checks and relocates the VIP
18
The Virtual IP (VIP) Concept
The VIP is an IP address, controlled by the CRS
Should be from the public subnet
Should be resolvable through DNS or /etc/hosts
Used by the RAC database to avoid TCP/IP timeouts when recognizing node or interface down events
Used by third-party applications so that they can still be reached at the same IP after being moved to the surviving node in case of failover
Should be used instead of the static public IP
19
Using CRS with Third Party APPS - Overview
An application profile should be added to the OCR. The main attributes are:
Action Program – an executable to start/stop/check the application
Privileges – which user can start/stop the application
Resource – a resource name for your application
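A minimal sketch of what an action program such as /root/apache_crs.sh could look like is shown below. CRS calls the program with start, stop or check and looks at the exit status (0 means success, or "running" for check). The apachectl path and the pid file location are assumptions for a typical Linux Apache install, and the function name apache_action is made up for the example; this is not the script used in the demo.

```shell
#!/bin/sh
# Hypothetical action program skeleton for the apache_crs resource.
# Assumed locations; adjust to the local Apache installation.
APACHECTL="${APACHECTL:-/usr/sbin/apachectl}"
PIDFILE="${PIDFILE:-/var/run/httpd.pid}"

apache_action() {
    case "$1" in
        start) "$APACHECTL" start ;;
        stop)  "$APACHECTL" stop ;;
        check)
            # healthy only if the pid file exists and that process is alive
            [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
            ;;
        *)
            echo "usage: $0 start|stop|check" >&2
            return 2
            ;;
    esac
}

# When invoked with an argument (as CRS does), run the action and report its status
if [ "$#" -gt 0 ]; then
    apache_action "$1"
    exit $?
fi
```

The check branch is deliberately simple. A more careful script might also request a page from the server, at the cost of a slower check.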
20
Using CRS with Third Party APPS - Creating the profile
[oracle@class01 ~]$ crs_profile -create apache_crs -t application -dir ./ -a /root/apache_crs.sh -r ora.class01.vip
[oracle@class01 ~]$ ll
total 164
-rw-r--r-- 1 oracle oinstall   760 Aug 19 17:24 apache_crs.cap
drwxr-xr-x 2 oracle oinstall  4096 Aug 13 18:58 Desktop
-rw-r--r-- 1 oracle oinstall 43387 Aug 14 12:47 ocr_bef.dmp
-rw-r--r-- 1 oracle oinstall 56929 Aug 15 16:19 OCRDUMPFILE
[oracle@class01 ~]$
21
Using CRS with Third Party APPS - Registering the Profile
[oracle@class01 ~]$ crs_register apache_crs -dir ./
[oracle@class01 ~]$ crs_stat |grep -A 5 apache
NAME=apache_crs
TYPE=application
TARGET=OFFLINE
STATE=OFFLINE

NAME=ora.class01.LISTENER_CLASS01.lsnr
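The NAME=/TYPE=/TARGET=/STATE= blocks printed by crs_stat are easy to collapse into one line per resource, which is handy when the full (non-truncated) resource names are needed. A small sketch follows; the function name crs_stat_summary is made up, and the sample input is the block from the listing above.

```shell
#!/bin/sh
# Collapse crs_stat output (read on stdin) into "name target state" lines.
crs_stat_summary() {
    awk -F= '
        $1 == "NAME"   { name = $2 }
        $1 == "TARGET" { target = $2 }
        $1 == "STATE"  { print name, target, $2 }   # STATE closes each block
    '
}

crs_stat_summary <<'EOF'
NAME=apache_crs
TYPE=application
TARGET=OFFLINE
STATE=OFFLINE
EOF
```

On a cluster node the input would come from crs_stat | crs_stat_summary.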
22
Using CRS with Third Party APPS - Setting the Permissions
Setting the owner
[root@class01 oracle]# /u01/app/oracle/product/11.1/crs11/bin/crs_setperm apache_crs -o root
Setting the rights
[root@class01 oracle]# /u01/app/oracle/product/11.1/crs11/bin/crs_setperm apache_crs -u user:oracle:r-x
23
Using CRS with Third Party APPS - Starting and Stopping the resource
Checking the state
[oracle@class01 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
apache_crs     application    OFFLINE   OFFLINE
Starting the resource
[oracle@class01 ~]$ crs_start apache_crs
Attempting to start `apache_crs` on member `class01`
Start of `apache_crs` on member `class01` succeeded.
[oracle@class01 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
apache_crs     application    ONLINE    ONLINE    class01
24
Using CRS with Third Party APPS - Starting and Stopping the resource
Stopping the resource
[oracle@class01 ~]$ crs_stop apache_crs
Attempting to stop `apache_crs` on member `class01`
Stop of `apache_crs` on member `class01` succeeded.
[oracle@class01 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
apache_crs     application    OFFLINE   OFFLINE
25
Using CRS with Third Party APPS - Failover
Step 1: Apache and the VIP are running on node 1
[oracle@class02 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
apache_crs     application    ONLINE    ONLINE    class01
ora....01.lsnr application    ONLINE    ONLINE    class01
ora....s01.gsd application    ONLINE    ONLINE    class01
ora....s01.ons application    ONLINE    ONLINE    class01
ora....s01.vip application    ONLINE    ONLINE    class01
Here we pull the power cable from node 1
26
Using CRS with Third Party APPS - Failover
Step 2: Apache and the VIP go offline
[oracle@class02 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
apache_crs     application    ONLINE    OFFLINE
ora....01.lsnr application    ONLINE    OFFLINE
ora....s01.gsd application    ONLINE    OFFLINE
ora....s01.ons application    ONLINE    OFFLINE
ora....s01.vip application    ONLINE    OFFLINE
27
Using CRS with Third Party APPS - Failover
Step 3: Apache and the VIP come online on node 2
[oracle@class02 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
apache_crs     application    ONLINE    ONLINE    class02
ora....01.lsnr application    ONLINE    ONLINE    class01
ora....s01.gsd application    ONLINE    ONLINE    class01
ora....s01.ons application    ONLINE    ONLINE    class01
ora....s01.vip application    ONLINE    ONLINE    class02
NOTE: The customer does not have to change the IP address requested in the browser. Apache is still accessible at the VIP address
28
Using CRS with Third Party APPS - VIP Note
Oracle does not recommend using the same VIP for more than one application. In our case we use the database VIP for Apache as well.
To comply with that we should create a new VIP, dedicated to the Apache server, and use it instead of the database VIP. It would operate exactly like the database VIP but would be a separate resource
29
Using CRS with Third Party APPS - Failover
Step 4: Node 1 comes back. The VIP goes back to node 1. Apache is still running on node 2, so it is not reachable at the VIP at that moment
[oracle@class02 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
apache_crs     application    ONLINE    ONLINE    class02
ora....01.lsnr application    ONLINE    OFFLINE
ora....s01.gsd application    ONLINE    OFFLINE
ora....s01.ons application    ONLINE    OFFLINE
ora....s01.vip application    ONLINE    ONLINE    class01
30
Using CRS with Third Party APPS - Failover
Step 5: Apache also goes back to node 1, since it is declared to be dependent on the node 1 VIP. It is reachable again
[oracle@class02 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
apache_crs     application    ONLINE    ONLINE    class01
ora....01.lsnr application    ONLINE    ONLINE    class01
ora....s01.gsd application    ONLINE    ONLINE    class01
ora....s01.ons application    ONLINE    ONLINE    class01
ora....s01.vip application    ONLINE    ONLINE    class01
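While watching a failover like the one in the steps above, the interesting rows are those where the Target is ONLINE but the State is still OFFLINE: CRS wants the resource up, but it is not up yet. A small filter over crs_stat -t output can list them. This is a sketch; the function name down_resources is made up, and the sample rows are the Step 4 output.

```shell
#!/bin/sh
# Read `crs_stat -t` output on stdin and print the resources that should be
# running (Target ONLINE) but are not (State OFFLINE).
down_resources() {
    awk '$3 == "ONLINE" && $4 == "OFFLINE" { print $1 }'
}

down_resources <<'EOF'
Name           Type           Target    State     Host
------------------------------------------------------------
apache_crs     application    ONLINE    ONLINE    class02
ora....01.lsnr application    ONLINE    OFFLINE
ora....s01.gsd application    ONLINE    OFFLINE
ora....s01.ons application    ONLINE    OFFLINE
ora....s01.vip application    ONLINE    ONLINE    class01
EOF
```

For the Step 4 rows it lists the listener, gsd and ons resources of node 1, which have not been restarted yet. Resources stopped on purpose (Target OFFLINE) are not reported.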
31
Using CRS with Third Party APPS - Using its own VIP
Creating a new, application-specific VIP
[oracle@class01 ~]$ crs_profile -create apache_vip -dir ./ -t application -a \
/u01/app/oracle/product/11.1/crs11/bin/usrvip \
-o oi=eth1,ov=192.168.16.110,on=255.255.255.0,ap=0
Setting the ap (active placement) option to 0 tells the system not to re-evaluate the resource placement when a new node is added. Our VIP is not tied to a particular node. It starts on any node at startup, fails over to any surviving node in case of failure, and does not return when the original node starts again
Setting the permissions
[root@class01 ~]# ./crs_setperm apache_vip -o root
[root@class01 ~]# ./crs_setperm apache_vip -u user:oracle:r-x
32
Using CRS with Third Party APPS - Using its own VIP
Making apache_crs dependent on the new apache_vip
apache_crs currently depends on ora.class01.vip. To change that:
[root@class02 oracle]# ./crs_register apache_crs -update -r apache_vip
Now apache_crs will follow apache_vip to any node
When apache_vip starts on a node, apache_crs starts on the same node
When apache_vip fails over to ANY surviving node, apache_crs fails over to the same node
When the failed node starts up again, apache_vip will not go back to it (active placement is 0), and neither will apache_crs
33
Using CRS with Third Party APPS - Using its own VIP – we got a Service
No particular node. We never know where the application runs, but we always access it at the apache_vip
We need to share binaries
We need to share the configuration files
We need to share everything the application needs to operate, so that each node can access it in the same directory tree
And OCFS is here to help
34
The clusters and the Oracle Universal Installer
OUI supports cluster level installations – installing CRS and Oracle Database on all the cluster nodes simultaneously
Scripts provided under install_directory/install:
- runSSHSetup.sh – sets up user equivalence
- addNode.sh – adds a node to an existing cluster (calls OUI)
- attachHome.sh/detachHome.sh – attach/detach existing homes to/from the Oracle Inventory
The runcluvfy.sh script, under install_directory, checks all the prerequisites
35
The clusters and the Oracle Universal Installer
The Oracle Inventory now tracks which cluster members contain each home directory
36
[oracle@class01 ~]$ ls /u01/app/oraInventory/ContentsXML/
comps.xml  inventory.xml  libs.xml
[oracle@class01 ~]$ cat /u01/app/oraInventory/ContentsXML/inventory.xml
<?xml version="1.0" standalone="yes" ?>
<!-- Copyright (c) 1999, 2006, Oracle. All rights reserved. -->
<!-- Do not modify the contents of this file by hand. -->
<INVENTORY>
<VERSION_INFO>
   <SAVED_WITH>11.1.0.6.0</SAVED_WITH>
   <MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>
</VERSION_INFO>
<HOME_LIST>
<HOME NAME="OraCrs11g_home" LOC="/u01/app/oracle/product/11.1/crs11" TYPE="O" IDX="1" CRS="true">
   <NODE_LIST>
      <NODE NAME="class01"/>
      <NODE NAME="class02"/>
   </NODE_LIST>
</HOME>
<HOME NAME="OraDb11g_home1" LOC="/u01/app/oracle/product/11.1/db_1" TYPE="O" IDX="2">
   <NODE_LIST>
      <NODE NAME="class01"/>
      <NODE NAME="class02"/>
   </NODE_LIST>
</HOME>
</HOME_LIST>
</INVENTORY>
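To check quickly which nodes the inventory records for each home, the HOME and NODE tags can be read with a few lines of awk. This is only a sketch: it assumes every tag sits on its own line, as in the file OUI writes, and a real XML parser would be more robust. The function name list_homes is made up for the example, and the sample is trimmed from the inventory shown above.

```shell
#!/bin/sh
# Print "home-name location node1,node2,..." for every HOME in inventory.xml.
# Reads the XML on stdin and assumes one tag per line.
list_homes() {
    awk '
        # return the value of attribute `key` from a tag line, or "" if absent
        function attr(line, key,    re, s) {
            re = key "=\"[^\"]*\""
            if (match(line, re) == 0) return ""
            s = substr(line, RSTART, RLENGTH)
            return substr(s, length(key) + 3, length(s) - length(key) - 3)
        }
        /<HOME /   { home = attr($0, "NAME"); loc = attr($0, "LOC"); nodes = "" }
        /<NODE /   { n = attr($0, "NAME"); nodes = (nodes == "" ? n : nodes "," n) }
        /<\/HOME>/ { print home, loc, nodes }
    '
}

list_homes <<'EOF'
<HOME_LIST>
<HOME NAME="OraCrs11g_home" LOC="/u01/app/oracle/product/11.1/crs11" TYPE="O" IDX="1" CRS="true">
   <NODE_LIST>
      <NODE NAME="class01"/>
      <NODE NAME="class02"/>
   </NODE_LIST>
</HOME>
<HOME NAME="OraDb11g_home1" LOC="/u01/app/oracle/product/11.1/db_1" TYPE="O" IDX="2">
   <NODE_LIST>
      <NODE NAME="class01"/>
      <NODE NAME="class02"/>
   </NODE_LIST>
</HOME>
</HOME_LIST>
EOF
```

The script only reads the file, in line with the "do not modify by hand" warning in its header.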
37
The clusters and the Oracle Universal Installer – the command line options
[oracle@class01 bin]$ ./runInstaller -help
......
-clusterware oracle.crs,<crs version>
Version of Cluster ready services installed.
-addNode
For adding node(s) to the installation. Wrapped by addNode.sh
-attachHome
For attaching homes to the OUI inventory. Wrapped by attachHome.sh
-detachHome
For detaching homes from the OUI inventory without deleting inventory directory inside Oracle home.
38
The clusters and the Oracle Universal Installer – the command line options
-updateNodeList
For updating node list for this home in the OUI inventory. Particularly useful when removing a node from the cluster
-remoteshell <Path>
Unix specific option. Used only for cluster installs, specifies the path to the remote shell program on the local cluster node.
And many more
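As an illustration of -updateNodeList, here is a small dry-run helper that only prints the command for a given home and the nodes that remain in the cluster. The ORACLE_HOME= and CLUSTER_NODES={...} arguments follow the form I remember from Oracle's node removal procedures; treat them as an assumption and check the documentation of your release before running anything. The function name update_node_list_cmd is made up.

```shell
#!/bin/sh
# Dry run: print (do not execute) a runInstaller -updateNodeList command.
# $1 = Oracle home, remaining arguments = nodes that stay in the cluster.
# The argument names are an assumption; verify them for your release.
update_node_list_cmd() {
    home=$1
    shift
    nodes=$(IFS=,; echo "$*")        # join the node names with commas
    printf './runInstaller -updateNodeList ORACLE_HOME=%s "CLUSTER_NODES={%s}"\n' \
        "$home" "$nodes"
}

# Example: class02 was removed, only class01 remains
update_node_list_cmd /u01/app/oracle/product/11.1/db_1 class01
```

Printing the command first makes it easy to review the node list before touching the inventory.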
39
The Bottom Line (or what I like)
CRS looks good, reliable and mature since 10gR2
Now we have a complete set of tools to change almost everything in the configuration
Now we can multiplex the OCR and the Voting Disk for better reliability
Now Oracle fully supports adding and removing nodes from the cluster, along with the utilities for that
40
The Bottom Line (or what I don't like)
There are many utilities for management, often duplicating functionality
It is still easy to mess it up (say, mix up the private and the public IPs)
Although possible, reconfiguration (say, fixing mixed-up private and public IPs) is still quite a pain: lots of commands, often not very intuitive
Some of the tasks (for example managing the inventory while adding and removing nodes) have to be done by hand, typing commands which are sort of “black magic”
There is still work to be done in documenting CRS.