
Solaris OBP Reference Guide

Date post: 08-Apr-2015
Upload: chafu
BOSTON DISTRICT T. O. I Hand Book 07/24/06

Disclaimer

This Document is for reference only.

The purpose of this document is to give the SSE a quick reference to a broad amount of material. It is not intended to replace the original product manuals, and should not be used in place of these manuals or substituted for training on these products.

This is best used as a tool to get you in the right frame of mind (product wise) when preparing to go on a call.

Comments, suggestions, and requests for updated copies should be sent to:

[email protected]

Copies also available at:

http://webhome.east/boston/toi.html

Table of contents

Desktop configurations
Firmware revision number
OBP Escape hatches
nvalias, NVRAMRC
reset Host ID
Boot sequence
Run Levels
Restore Boot Block
E1000/2000 info
E series info
OBP commands
OBP device path breakdown
Device tree listing - desktop
E-450 information
E-10000 information
Blacklist
System Bd power procedure
E 10k component numbering
Scsi Array Model 100
Model 200 Array
ssaadm commands
Replace WWN on SSA
A1000 Array
D1000 Array
RSM Disk Tray
A3000/3500 Array
A5000 Array
luxadm commands
Disk replacement in Veritas
A5000 min configuration
A5000 addressing
A5000 Target assignments
RDAC
Raid Overview
Raid Levels
Boot process
Diagnostic commands
Diagnostic Files
Watchdog resets
What to look for on a watchdog reset
Dump Analysis
adb commands
crash commands
kadb
Sunsolve
SunVTS
STORtools
Explorer Scripts
Performance Analysis tools
Backup
ufsdump
ufsrestore
tar
cpio
dd
How to get a core dump on a 2.x server
Dump device bad when saving core on encapsulated root
Uncompressing Files
T300 (purple)
ACT (A Coredump Tool)
Advantages of Splitting a Drive into Multiple File Systems
How to configure a system to run on a network
SEVM - How to recover a primary boot disk
Disable DMP
Memory Scrubber
Display remote App GUI locally
Cluster 2.x
Encapsulating root after using Environmental CD to load O/S
Adding a second network interface
Adding a default gateway
Volume Manager (general info)
FTPing to and from sunsolve
Serengeti 3800, 4800, 6800
mounting CDROM without vold
mailx: send files/messages
StarCat 15k notes
local-mac-address
SDS - How to mirror root
IPMP
T3B or T3+ Firmware Rev 2.1 New Functions
Hitachi StorEdge 99X0 Arrays
SunFire forgotten password
StorEdge Network FC Switch
Hitachi 9900v notes
Minnow 3300 Array
Tuning ecache scrubber scan rate
VxWorks (serengeti)
LVD adapter information
Replacing a nordica bd in a 15K SC
Serengeti/15k DR boards
Clean up non-root disc “controller” numbers
Starcat Portid cheat sheet
Starcat SC clean slate
Starcat redx info
StorADE
Get FRU info from serengeti
Swap
Maserati Notes - StorEdge 6320 and 6120
Flash Archive interactive install
UltraSPARC III CPU Diagnostic Monitor (CDM)
SunFire Service Mode Password Generator
V440 ALOM, raidctl
Finding Solaris release and distribution loaded
Find local NIS servers
Network troubleshooting command, files, daemons
How to find your way around a B1600
Cluster 3.x
SMSupgrade 1.4.1 info
Solaris 9 SVM (sds) disk replacement
SC rebuild after total disk failure
15K DR / hpost examples
smsbackup: manually check a backup file
3310/3510 Disk replacement
How to mount a CD image file (.iso) as a filesystem
Removing the top cover on a V20z
Explorer -w scextended with cron
Useful COD commands
ALOM4v Ontario/Erie (Niagara)
Forgotten password (ALOM4v)
Solaris to Linux cross reference
SSH information
Galaxy ILOM info
SSH with SMS 1.5

Desktop Configurations

           processor        memory        sbus slots  onboard hosts  network
SS4        70, 85, 110      1sim/bank     1           scsi II        10bt/AUI    16,32
SS5        70,85,110,170    1sim/bank     3           scsi II        10bt        8/32
SS10       20,30,40,50      1sim/bank     4           scsi II        10bt        16/64
SS20       50-150mhz        1sim/bank     4           scsi II        10bt/AUI    16/32/64
Ultra 2    167,200,300      2sims/bank    4           fast/wide      10/100      16/32/64/128
Ultra 5    270              2sims/bank    PCI         N/A            10/100      can't use 256mb
Ultra 10   d/b              2sims/bank    PCI         N/A            10/100
1000       .....            4/group       3           ......         ......
1000e      .....            ......        3           ......         ......
2000       51,61,81         4/group       4           ......         .....       8/32 meg
2000e      51,61,81         4/group       4           ......         ......

Commands to find firmware revision number:

# /usr/platform/`uname -i`/sbin/prtdiag -v    (gives you a listing of all boards)
# /usr/sbin/prtconf -V                        (gives you the master board's version)
ok .version
ok banner

OBP escape hatches

L1-a (stop-a) (Ctrl Break)*   To stop a process in OBP or to bring a system down in Solaris (not recommended)
L1-f (stop-f)                 Enters command mode on ttya before probing H/W. Use 'fexit' to continue with the initialization sequence.
L1-d (stop-d)                 Sets the diag-switch? parameter to true. Enables verbose output during POST.
L1-n (stop-n)                 Resets NVRAM contents to defaults (not recommended, see 'nvrecover').
L1 (stop)                     Runs POST in INIT mode (does not depend on security mode)

* laptop key strokes

Make a new alias... OBP, printenv, nvramrc

1 ok show-disks
2 select a disk controller a,b,c
3 ok nvalias (alias name) (ctl-y)

Control-y is the yank command, and will give you the path you selected in the show-disks command. You have to type sd@n,n for Sbus or disk@n for PCI at the end.

To recover NVRAMRC, printenv, veritos

ok nvrecover (ctl-c)
ok nvstore

To remove an alias nvramrc, printenv, devalias

ok nvunalias (alias name)

To reprogram your MAC address and host ID

ok 17 0 mkp (return)

ok 8 0 20 xx yy zz 080020xxyyzz mkpl (return)(cursor disappears)(ctl-d)(ctl-r)

ok banner

Boot Sequence

1 Beep (keyboard)
2 LEDs blink, screen goes blank (POST)
3 Banner
4 Testing memory (selftest#mem)
5 Boot (auto-boot?)
6 diag-switch?
7 prom loads boot block (UFS reader)

Run levels                          O/S command   rc.script

1  single user                      init 1        /etc/rc1
2  multi-user but no sharing        init 2        /etc/rc2
3  multi-user with sharing          init 3        /etc/rc3
4  N/A user configurable
5  shutdown and shuts off pwr       init 5        /etc/rc5
6  stop and reboot                  init 6        /etc/rc6
0  goto firmware                    init 0        /etc/rc0

Restore boot block

# installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c0t0d0s0
                             |
                             ex: sun4u

Deskside server

Key switch

-standby   no power
-on        normal
-diag      verbose post, on board, master bd (1000,2000)
-secure    prevents a (stop-a) and disables reset switch

1000/2000 server Info

1000    40mhz control card
1000e   50mhz control card
2000    40mhz control card
2000e   50mhz control card

*auto master- if you replace any CPU/Mem cards put new card in slot 0

Master Board requirements:
CPU
Memory
Latest firmware rev.

* To determine which board is master:

1 ok print-nvram-stat

2 switch cables to board you want to be master
<2> ok 0 switch-cpu

3 make board 0 a master bd
<0> ok set-master-nvram
<0> ok print-nvram-stat

4 Get rid of unwanted master
<0> ok 2 switch-cpu
<2> ok clear-master-nvram
<2> ok print-nvram-stat
(move rs232 cable to master board)

command to show all sbus cards: <ok> show-devs

NVRAM contents 1000/2000

If you need to change a CPU board, you do not need to do anything with the NVRAM. There is a copy on the control board and it will be automatically transferred. If you need to change a control board, you must use the procedure in the FE handbook (pg. cpu81) to invalidate the contents of NVRAM on the new control board.

Ultra Enterprise 3000 Information

2 power supplies
6 cpus
I/O board w/sbus, internal scsi adapter
clock board, clock, voltage monitor, reset, console (keep firmware)

CPU boards

CPU/mem bd   Speeds          Processors   memory

501-2976     83mhz           167mhz       8 @ 8
501-4312     more sram       250mhz       32 @ 8
501-4882     83-90-100mhz    333mhz       128 @ 8
                             400mhz
                             600mhz

I/O boards

I/O type   Speed               sbus       o/b fiber   on board host   network

1          83mhz               3          soc         fas             10/100
2          83mhz               2 (upa)    soc         fas             10/100
3          83 and 83/90/100    0 (2pci)   n/a         ultra wide      10/100
4          83 and 83/90/100    3          soc+        fas             10/100
5          83/90/100           2upa       soc+        f/w scsi        10/100

Clock boards

Clock board numbers   Speed

501-2975              83mhz
501-4286              83mhz
501-4946              83-90-100 (x500 servers)
501-5365              83-90-100 (x500 servers, shipped with the E6500)

OBP commands:

(OBP reference guide) get this... http://docs.sun.com

banner              a brief description of the system: MAC address, firmware level, host ID
boot -v             will verbose boot the system from defaults set in printenv list and devalias file.
boot -a             will boot without the use of /etc/system file (interactive boot)
boot -s             will boot in single user mode
boot (alias)        will boot the server from the specified alias in the devalias file
cd /                will put you in a directory hierarchy for listing hardware paths. 'device-end' gets
                    you out of this mode
devalias            shows you a listing of your device aliases
limit-ecache-size   will allow you to boot a 400mhz 8meg cache processor on os 2.5.1 or 2.6 CD.
                    Solaris 7 works fine. Jumbo patch 105181-14 for 2.6 or 103640-27 for 2.5.1
nvalias             is used to create an alias
nvunalias           is used to remove an alias. see previous example.
nvrecover           is used to recover a deleted alias
nvstore             is used with nvrecover
.properties         when you are in device hierarchy mode (cd /) on 3.x systems you can use the
                    .properties command to see info about the device path you are on. use
                    .attributes for the same function on 2.x systems.
probe-scsi          list only internal disks
probe-scsi-all      list all scsi devices
probe-fcal          list all photon drives
printenv            used to give you a listing of the environment settings
prom-copy           will copy the flash prom from one board to another. boards must be the same type.
                    prom-copy (src dest) ex: ok prom-copy 0 2 will copy flash prom from
                    board 0 to board 2
reset               will reset the system
setenv (variable)   used to set an environment setting (variable). use printenv to get setting syntax.
set-default         will set a line in the environment to default. ex: ok set-default output-device
show-post-results   show results of the last POST
show-disks          will give you a disk controller listing and is used when creating an nvalias.
show-devs           will give you a listing of all device paths on the system. Use the 'cd /' command
                    to go down the path.
socal-diag-all      when you are in device hierarchy mode (cd /), you can go down a socal path
                    (ex: cd /sbus@3,0/SUNW,socal@0,0) and run OBP diags on that path.
show-wwn            when you are in device hierarchy mode (cd /), you can go down a socal path
                    (ex: cd /sbus@3,0/SUNW,socal@0,0) and show the world wide number and loop id.
selftest            when you are in device hierarchy mode (cd /), you can go down a socal path
                    (ex: cd /sbus@3,0/SUNW,socal@0,0) and run the socal selftest.
sifting             will search for the command specified. ex: ok sifting probe-scsi
update-proms        will update the proms (do not use to copy to cpu board 0, use the prom-copy # #
                    command)
watch-net           watch packets and clock ticks
watch-net-all       watch packets and clock ticks
words               will list all the Forth commands for the current screen.
xir-state-all       Externally Initiated Reset command, used to gather info on a hung machine

Command to reset the line in the environment to defaults:

set-default ex: ok set-default output-device

Move an SBus card from one slot to another:

1. Remove controller (SBus card)
2. boot -r, remove path_to_inst
3. boot -ra

You might also be able to switch the sbus-probe-list order to change the C# in c#t#d#s#.

OBP path breakdown for Enterprise machines

/sbus@7,0/SUNW,fas@3,8800000/sd@1,0
      |            |            | |
      |            |            | lun#
      |            |            target#
      |            sbus slot
      convert to decimal, divide by 2, round down: result is bd #
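
The breakdown above can be sketched in Python. This is only an illustration, not a Sun tool: the function name decode_enterprise_path and the regular expression are made up for this example, and the unit addresses are assumed to be hexadecimal because the breakdown says to convert them to decimal.

```python
import re

# Hypothetical helper (not part of Solaris or OBP). Splits an
# Enterprise-style SBus disk path into the fields named in the breakdown.
# Assumption: every unit address in the path is hexadecimal.
_PATH_RE = re.compile(
    r"^/sbus@([0-9a-f]+),[0-9a-f]+"    # sbus@<unit>,<offset>
    r"/[^/@]+@([0-9a-f]+),[0-9a-f]+"   # <adapter>@<sbus slot>,<offset>
    r"/sd@([0-9a-f]+),([0-9a-f]+)$",   # sd@<target>,<lun>
    re.IGNORECASE,
)


def decode_enterprise_path(path):
    match = _PATH_RE.match(path)
    if match is None:
        raise ValueError("unrecognized device path: " + path)
    sbus, slot, target, lun = (int(field, 16) for field in match.groups())
    return {
        "board": sbus // 2,  # convert to decimal, divide by 2, round down
        "sbus_slot": slot,
        "target": target,
        "lun": lun,
    }


print(decode_enterprise_path("/sbus@7,0/SUNW,fas@3,8800000/sd@1,0"))
```

For the example path, sbus@7 gives 7 / 2 = 3.5, rounded down to board 3, with sbus slot 3, target 1, lun 0.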

Device tree listings for desktop machines

4m: ss4, ss5, ss10, ss20
/iommu/sbus/cgsix                  path to monitor card
           /ledma/le               path to on-board network adapter
           /espdma/esp             path to on-board scsi devices

4u ultra 1 - 140, 170
/sbus@1f,0/ledma@e/le              path to on-board network adapter
          /espdma/esp              path to onboard scsi devices

ultra 1 - 140e, 170e, 200e
/sbus@1f,0/hme                     path to on board network
          /fas                     path to on-board scsi devices

ultra 2
/upa/sbus/hme                      path to on-board network
         /fas@e                    path to on-board scsi devices

ultra 5,10
/upa/pci@1f/apb/pci@1,0            path to pci slots 1-3
/upa/pci@1f/apb/pci@1,1/ide@3      path to cdrom and disk
                       /network@1,1   path to on-board network
                       /m64b          path to on-board graphics adapter
                       /ebus@1        path to system devices

ultra 30
/upa/pci@1f,2000                   path to pci slots 1 (33/66mhz) - 4 (33mhz)
/upa/pci@1f,4000/scsi@3            path to on-board scsi devices
                /network@1,1       path to on-board network (hme)
                /ebus@1            path to system devices

ultra 60
/upa/pci@1f,2000                   path to pci slots 1 (33/66mhz) - 4 (33mhz)
/upa/pci@1f,4000/scsi@3            path to internal scsi devices
                /scsi@3,1          path to external scsi devices
                /network@1,1       path to network (hme)
                /ebus@1            path to system devices

acronyms for above listings
esp      scsi2 50 pin
fas      fast and wide scsi 68 pin
hme      100mb ethernet
isp      Intel Scsi Processor
le0      10mb ethernet
qe       Quad Ethernet
qfe      Quad fast Ethernet
soc      Serial Optical Controller
socal    Serial Optical Controller +

Ultra 450 and Ultra Enterprise 450

ok setenv disk_led_assoc    add a pci adapter to printenv list to get entries into prtconf so you can do the following procedure:

1. To find a drive path on an ultra 450, get the path '/pci@6,40001# - - - - - - - - - - /sd@0,0 from the format command.
2. Change the 'sd' to 'disk' and '0,0' to 0
3. #prtconf -vp | grep 'c#t#d#. . . . . . . . . . . . . /disk@#
4. results will be the slot# and the disk# will tell you the drive.
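
Step 2 of the procedure is a plain text rewrite, sketched here in Python. The function name to_prom_disk_path and the sample path are hypothetical (the handbook's own path is elided), and dropping the LUN for any target is an assumption generalized from the single '0,0' to 0 example.

```python
import re


def to_prom_disk_path(format_path):
    """Rewrite a trailing sd@<target>,<lun> as disk@<target>.

    Assumption: only the last path component changes and the LUN is
    simply dropped, as in the handbook's 'sd@0,0' -> 'disk@0' example.
    """
    new_path, count = re.subn(
        r"/sd@([0-9a-f]+),[0-9a-f]+$",
        r"/disk@\1",
        format_path,
        flags=re.IGNORECASE,
    )
    if count != 1:
        raise ValueError("path does not end in sd@<target>,<lun>: " + format_path)
    return new_path


# Hypothetical path for illustration only, not taken from a real system.
print(to_prom_disk_path("/pci@6,4000/scsi@4/sd@0,0"))
```

The rewritten string is what you would then look for in the prtconf -vp output in step 3.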

Device tree listing: see FE Handbook 1, cpu-126 and cpu-128

mfg-options is an NVRAM variable holding a decimal value that sets up the system as a workstation or a server. The UE 450 is currently not offered as a workstation.

ok setenv mfg-options 0     (workstation default) Ultra 450
ok setenv mfg-options 49    (server default) Ultra Enterprise 450

upa-port-skip-list is an NVRAM variable used to skip probing of upa ports. The following upa ports are used:

Processors      upa ports 0,1,3
framebuffers    upa ports 1d and 1e
psycho          upa ports 4,6,1f

ex: ok setenv upa-port-skip-list 3,1d (skips CPU3 and FFB1)

obdiag is a command you can run for prom based diagnostics

pcIO-probe-list is an NVRAM variable used to control the probe order for onboard PCI devices (/pci@1f,4000)

pci-slot-skip-list is an NVRAM variable used to skip probing of PCI devices plugged into the backpanel slots

memory-interleave is an NVRAM variable that controls how OBP sets memory interleaving

env-monitor is an NVRAM variable that determines how OBP responds to environmental monitoring via the I2C serial bus.

.post command displays the results of POST

.asr command displays the system devices and settings

asr-enable, asr-disable commands enable and disable system devices.

/associations The associations tree node contains entries representing categories of associations or connections between system components that are dispersed in the device tree.

ex: ok cd /associations/slot2dev
    ok .properties

    ok cd /associations/slot2led
    ok .properties

    ok cd /associations/slot2disk
    ok .properties

E10000

SSP basic commands

hostinfo        will give you a status of different parts of the E10k
                -F  fan status, on/off, speed
                -S  signature blocks (board ID)
                -h  processor status
                -p  power status (boards and centerplane)
                -t  temperature status

domain_create   requirements:
                system boards must be present, not in use
                Sufficient memory and at least one proc
                At least 1 network interface
                Connection to a disk for OS
                Unique hostname
                Entry in host database
                template eeprom.image file

                syntax:
                Create a new domain:
                ssp:domain% domain_create -d domain -b 0 3 4 -o 2.5.1 -p platform

                Recreate a domain that previously existed (domain_history file)
                ssp:domain% domain_create -d domain

domain_remove   Domain must be halted
                syntax: # ssp:domain% domain_remove -d domain

domain_rename   syntax: # ssp:domain% domain_rename -d old_name -n new_name

domain_status   will tell you which boards are in each domain

domain_switch   will change the domain your ssp window is connected to.

domain_history  Displays the contents of domain_history file (contains removed domain info)

power           no argument will tell you the voltages at each board
                -on
                -off
                -all = everything except AC sequencers   ex: power -on -all
                -ps  = power supply                      ex: power -on -ps #   (#=0-7)
                -p   = AC sequencer                      ex: power -on -p #    (#=0-4)
                -cb  = control board                     ex: power -on -cb #   (#=0-1)
                -sb  = system board                      ex: power -on -sb #   (#=0-15)
                -csb = center plane sprt bd              ex: power -on -csb #  (#=0-1)

fan             no arguments: same as hostinfo -F (fan status)
                -t = tray             ex: # fan -t x -p off (x = 0-15)
                -1 = group of trays   ex: # fan -1 x -p off (x=front,rear)
                -p on                 all fans on

autoconfig      Must be run when adding a new revision of a board to the system.
                May also be required when moving a board to a new slot.
                Not required if all boards are the same revision level.
                (Do not run on a system board that is running the OS, or on the centerplane when any domain is running the OS)

board_id        will read the serial number eeprom on specified board
                (has no effect on running domain)

thermcal_config thermcal_config must be run when installing a new board
                or moving a board to a new slot, or else temperature sensing for that board will be incorrect.
                Target board must be off for 30 minutes before running.
                Updates SSP file with conversion factors from serial eeproms.

                ssp:domain% edd_cmd -x stop
                ssp:domain% thermcal_config
                ssp:domain% edd_cmd -x start

bringup         boot the domain
                ex: # bringup -A off -l32   will bring system to the <ok> prompt and run hpost at level 32 (7-128)
                ex: # bringup               will bring up system (autoboot)

netcon          start network console session

Blacklist

- Edit via hostview or manually (vi)
- Explicit removal of components for isolation of intermittent faults or benchmarking
  - processors
  - IO controllers
  - ASICs
  - Memory banks
  - Boards
  - Busses

- Default location of blacklist file
  /var/opt/SUNWssp/etc/platform_name/blacklist

- After editing the blacklist file, halt the domain and re-run bringup to make changes take effect. (reboot does not cause hpost to reread the blacklist file)

Hostview
To remove a device from the blacklist file:
- Edit
- Blacklist
  (change view if required)
- MIDDLE click on blacklisted device (should change from black to white)
- run bringup to make changes take effect.

Redlist: $SSPVAR/etc/platform_name/redlist is an ASCII file that enables the system administrator or root to restrict, from the SSP, the configuration of the host system. It lists components that POST cannot touch, and whose state POST cannot change. Redlisted components are also considered effectively blacklisted. Never use redlisting if blacklisting will do.

System Board Power off Procedure

1. Have the customer bring down all jobs on the domain in question. Next, they need to either use the shutdown command or use the init 0 command to bring the system to the <ok> prompt.
2. After this has been done, go to the ssp login window. Login as ssp and (ssp password)
3. At the SUNW_HOSTNAME prompt, enter either the platform name or the name of the existing domain
4. Issue the 'domain_status' command. This will list all the domains and system boards associated with each domain.
5. Issue the 'domain_switch (domain name)' command to get to the proper domain.
6. Use the 'power -off -sb #' (#= system board #) command to power off the system board to be removed. MAKE SURE THE YELLOW LEDS ARE OFF BEFORE REMOVING BOARD.
7. After completing the work on the system board and the board has been reinstalled, use the 'power -on -sb #' (#=system board#) command to return the power to the system board.
8. Next use the 'bringup' command to autoboot or the 'bringup -A off' to stop at the <ok> prompt.


Component Numbering

Processors

Component             Solaris                  Hostview    POST
System Board 0 - 15                            SB 0 - 15   sysbd 0 - 15
Proc. Mod. 0 - 3      /SUNW,ultraSparc@0,0     00 - 63     proc 0.0 - proc 15.3
                      proc. in hex (0 - 3f)                sysbd#.proc#

I/O (SBus)

Component         Solaris     Cable Label           POST
I/O port 0 - 3    /sbus@40    SB0.0.0               scard 0.0.0
                              sysbd#.SBus#.Slot#    sysbd#.SBus#.Slot#

To decode the Solaris value: subtract 40, change to decimal, and divide by 4.
The answer is the board #; the remainder is the SBus #.

I/O (PCI)

Component         Solaris     Cable Label           POST
I/O port 0 - 3    /PCI@40     PCI0.0.0              scard 0.0.0
                              sysbd#.PCI#.0         sysbd#.PCI#.0

To decode the Solaris value: subtract 40, change to decimal, and divide by 4.
The answer is the board #; the remainder is the PCI #.
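
The decoding rule for SBus and PCI nodes (subtract 40 hex, convert to decimal, divide by 4) can be sketched in Python. This is only an illustration of the arithmetic, not an SSP tool; the function name decode_io_node is hypothetical, and the node strings are examples in the format shown in the tables.

```python
def decode_io_node(node):
    """Return (system board #, bus #) for a node such as 'sbus@41' or '/pci@7c,0'.

    Hypothetical helper: applies the rule from the table (subtract 40 hex,
    divide by 4; the quotient is the board, the remainder is the SBus/PCI #).
    """
    unit = node.split("@", 1)[1]      # text after '@', e.g. '41' or '7c,0'
    unit = unit.split(",", 1)[0]      # ignore anything after a comma
    offset = int(unit, 16) - 0x40     # subtract 40 (hex); result is now a plain integer
    if not 0 <= offset <= 63:         # 16 boards x 4 buses per board
        raise ValueError("unit address out of range: %r" % node)
    board, bus = divmod(offset, 4)    # quotient = board #, remainder = bus #
    return board, bus
```

For example, decode_io_node("sbus@41") gives board 0, SBus 1, and decode_io_node("sbus@7f") gives board 15, SBus 3.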

Memory

Component              POST
System board memory    mem x.0
                       system bd#.bank#

SSP: (notes)

/etc/netmasks should be: 10.0.0.0 255.255.255.0 (for private net, or cb1 will not come up)

Share the cdrom to load VTS:
    # share -F nfs -o ro,anon=0 /cdrom/cdrom0/s0

3.4 commands:

showfailover         Shows you the failover status
showdatasync         Shows you the datasync status (from main to spare)
setfailover on       Enables failover
setfailover force    Forces a failover to spare
setfailover off      Disables failover to spare
setdatasync backup   Back up files to spare
ssp_backup           Creates a ssp_backup.cpio file     ex: # ssp_backup /var/tmp
ssp_restore          Restores a ssp_backup.cpio file    ex: # ssp_restore /var/tmp/ssp_backup.cpio
ssp_config float     Lets you change the hostname for the floating hostname (the name should be in the hosts
                     files of both SSPs and also in /etc/ssphostname on the domains)


SCSI Array

MODEL 100

Front Panel LCD indications:

POST            Located in the top left corner (circle with line at 12:00). Indicates POST is running.
Service         Under POST icon (wrench). Service is needed; always displayed with another icon.
Controller      Located to the right of the service icon (looks like a SE SCSI icon). Indicates a controller problem.
Alphanumerics   POST - test codes and status value of failing test are flashed continuously.
                Normal operation - four LSDs of world wide number.
                Controller errors - panic code is flashed continuously, and controller icon is on.
Fan             Fan failure or heat problem.
Battery         Fast write cache. Low NVRAM battery voltage; battery should be replaced.
Drive           A small solid rectangle represents an available drive.
Fibre           Fiber optic link state. Two link icons, A and B. Switched on when link is established.

POST codes

01   LCD failure          Replace fan tray?
08   Fan failure          Replace fan tray
09   P/S failure          Replace power supply
30   Battery failure      Replace battery module
xx   Controller failure   Replace controller

100/110mhz | Model 11/2

| size of drives

Layout:

 ___ ______POWER SUPPLY______
|   | d0     | d0     | d0     |
| F | d1     | d1     | d1     |
| A | d2     | d2     | d2     |
| N | d3 T0  | d3 T2  | d3 T4  |
|   | d4     | d4     | d4     |
| T |________|________|________|
| R | d0     | d0     | d0     |
| A | d1     | d1     | d1     |
| Y | d2 T1  | d2 T3  | d2 T5  |
|   | d3     | d3     | d3     |
|   | d4     | d4     | d4     |
|___|________|________|________|
      Tray 1   Tray 2   Tray 3

(ex: drive d0 in T5 is c0t5d0s?)


STRIPE Trays...

MIRROR arrays.

*Use channel B first on controller Fiber to copper adapter, 1 port for each host.

** Run # ssaadm display cn (where 'n' = controller number); this will give you array info on this controller.

Solstice Disk Suite: "md" devices. You can change /etc/vfstab and /etc/system to bypass them and use the raw device.
Use the command 'metastat -s' to tie the "md" device name in vfstab to the physical partition name.
    # solstice &    (will run the GUI)

MODEL 200

The Sparc storage array model 200 is a rack mount disk array controller. Up to six differential SCSI disk trays can be connected to it. Each tray can hold up to six drives. Ports are numbered 0-5 right to left, top to bottom.

controller #   drive in tray
     |           |
     c2   t2   d0
           |
           port # on controller of array (or tray #, determined by port on array controller)

Connectors and switches:

Fiber optic connector   Connects F/O cables from host to array
Scan connector          Used to test SSA controller in factory
NVRAM LED               Gives info on the SSA NVRAM. Press the NVRAM button when the SSA is off; if the NVRAM LED
                        comes on, there is data pending on the NVRAM that must be flushed to disk using the
                        fastwrite software command.
NVRAM button            Used to determine if there is any data pending on the SSA NVRAM
DIAG switch             Used to set the diag level of the SSA. DIAG position for normal diagnostics,
                        DIAG EXT for extended diagnostics.
Reset switch            Resets array. Do not press while array is in use.
SYS OK LED              Gives info on SSA status. Blinking is running normally (freq = activity).
                        Off is no power or hung. Solid on is power but hung.

100/200 Array Commands:

# ssaadm release /dev/rdsk/c#t#d#s#        To release a specific disk
# ssaadm release c#                        To release all drives on a specific controller
# ssaadm stop /dev/rdsk/c#t#d#s#           To stop a specific disk
# ssaadm stop -t2 c#                       To stop a specific tray on a specific controller
# ssaadm stop c#                           To stop all drives on a specific controller
# ssaadm display c#                        To display status on all drives on a controller
# ssaadm -v download -w ####WWN##### c#    To download old WWN to new SSA controller (2.5 and >)


Procedure to replace WWN on a SSA

1. Boot from CDROM
     ok boot cdrom -sw
2. Locate the new array controller
     # ls -l /dev/dsk/c*t0d0s2 | grep NWWN      (NWWN = new controller's WWN, from the SSA display)
3. Mount the server's '/' filesystem on /a
     # mount -o ro /dev/dsk/c0t0d0s0 /a
4. Download the old address to the new controller
     # /a/usr/sbin/ssaadm -v download -w ####WWN##### c#      (####WWN##### = old WWN, c# from step 2)
5. # halt
6. Press reset on the back of the SSA

(If you don't know the original WWN, mount the root filesystem on /a and do an ls -l on /dev/dsk/c#t0d0s2.)

A1000 Disk array

- Same disk tray as the D1000
- Hardware RAID controller
- Ultra Differential Fast/Wide host connection
- 8-16 meg processor memory (2 SIMM slots)
- 16-64 meg data cache (2 SIMM slots)
- Battery backup for data cache
- SCSI ID switch on controller
- Two models: 8 HH or 12 low profile chassis

D1000 Disk tray

The D1000 disk tray is used in the StorEdge A3500 RAID array. 5, 8, or 15 D1000s can be used, depending on the configuration. It uses the same disk tray as the A1000, but a different controller. It has 2 sets of SCSI connectors; you can run 2 SCSI busses into it and divide the drives, or jumper the busses together and have the array on one bus.

- Does not have a hardware RAID controller
- 16-bit Ultra Differential Fast/Wide SCSI bus
- Two models: 8 HH (9.1gb or 18.2gb) or 12 low profile (4.2gb or 9.1gb)
- Hot plug disk drives
- Hot plug power and cooling units
- Dual power cables to separate sequencers

SCSI ID and Array ID are set on the rear DIP switch (D1000 can be configured for 1 or 2 busses)

sw1: Disk Array 1 ID:       Up: drive IDs 8-11 or 8-13.   Down: drive IDs 0-3 or 0-5
sw2: Disk Array 2 ID:       Up: drive IDs 8-11 or 8-13.   Down: drive IDs 0-3 or 0-5
sw3: Drives Remote Start:   Up: wait for SCSI command.    Down: check sw4
sw4: Drives Delay Start:    Up: start with delay (id*12). Down: start at power-on
sw5: Reserved

Module ID switch (rear): Wheel switch used to ID unit (1-5) when used in an A3500 configuration.


Disk Layout: D1000 (front view)

                    Array 2         |        Array 1
sw2: down  | 0  1  2   3   4   5    |  0  1  2   3   4   5   |  sw1: down
sw2: up    | 8  9  10  11  12  13   |  8  9  10  11  12  13  |  sw1: up

LEDs on back:              Color                                                  Location
Power supply status LED:   Normal (green), failure and other p/s is ok (amber)    P/S
Cooling status LEDs (4):   Normal (green), blower failure (amber)                 fan housing
Temp fault:                Normal (off), fault (amber)                            control bd
Controller power:          Normal (green), no power (off)                         control bd

RSM Disk Tray

RSM trays are used in the StorEdge A3000 RAID array. Each A3000 contains 5 RSM disk trays.

- Internally drives operate on a 16-bit Single-Ended Fast/Wide SCSI bus
- Externally the tray interface is a 16-bit Differential Fast/Wide SCSI bus
- 3 to 7 4.2gb or 9.1gb HH disk drives
- Hot plug disk drives
- Hot plug, redundant power and cooling units
- Dual power cables to separate sequencers

*** SCSI ID for the tray is set on the I/O board. Setting of 0-6 or 8-14; 8-14 is required for the RDAC module.

* SCSI ID for the SEN card is a wheel selection and should be set to 15 (F).

RSM front view (target IDs) ***

| 0 | 1 | 2  | 3  | 4  | 5  | 6  |
               or
| 8 | 9 | 10 | 11 | 12 | 13 | 14 |

LEDs/switches:

Disk LEDs:    Red - fault, Green - I/O activity

Panel LEDs:   Power on/off switch
              Power indicator (green)
              Power module A and B fault (red)
              Fan module warning (amber)
              Fan module failure (red)
              Over temp (red)
              Reset Alarm (pbs)


A3000/A3500

A3000
- 56 inch rack
- Contains 5 RSM disk trays
- 1 RDAC module
- Each RDAC module has dual hot plug RAID controllers

A3500
- 72 inch rack
- Contains 5, 7, or 15 D1000 disk trays
- 1, 2, or 3 RDAC modules
- Each RDAC module has dual hot plug RAID controllers

# raidutil -c (c#t#d#) -B    battery age info for that controller (A3x00)
# raidutil -c (c#t#d#) -R    to reset battery age after replacement (A3x00)

Break ,(esc), Q40, ld</Debug, arrayPrintSummary,cfgUnitList,vdShow,dstDevs, rdacMgrSetModeActivePassive, rdacMgrSetModeDualActive,rdacMgrAltCtlFail,rdacMgrAltCtlResetRelease,moduleList,sysReboot

A5000 (photon)

- The A5000 or Photon is a Fiber channel array
- Up to 14 HH or 22 low profile hot pluggable, dual ported FC-AL disk drives

Model #'s:
A5000 - 14 7200 rpm drives of 9.1GB each
A5100 - 14 7200 rpm drives of 18.2GB each
A5200 - 22 10000 rpm drives of 9.1GB each

RAID Manager

Commands:

# /usr/lib/osa/bin/rm6                               To run RAID Manager
# /usr/lib/osa/lad                                   Will give ctd#s, controller serial #s and LUN configurations
# fwutil /usr/lib/osa/fw/aaaaaaaaa.apd cxtxdxs0      Downloads appware to a controller (halt all I/O)
# fwutil /usr/lib/osa/fw/bbbbbbbb.bwd cxtxdxs0       Downloads bootware to a controller (halt all I/O)
# raidutil -c (c#t#d#) -b                            Battery age info for that controller (A3x00)
# raidutil -c (c#t#d#) -r                            To reset battery age after replacement (A3x00)

RAID Manager Device Naming Conventions

       Target ID of RAID controller
       |        slice
       |        |
   C#  T#  D#  S#
   |       |
   |       LUN # (created when setting up array)
   Host controller #
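
As an illustration of the naming convention above, the following Python sketch splits a device name such as c2t5d0s0 into its four fields. The function name parse_ctds and the returned key names are hypothetical and are not part of RAID Manager.

```python
import re

# c<host controller>t<target ID>d<LUN>s<slice>; the slice part is optional here.
_CTDS = re.compile(r"^c(\d+)t(\d+)d(\d+)(?:s(\d+))?$")

def parse_ctds(name):
    """Split a name like 'c2t5d0s0' into controller, target, LUN and slice."""
    match = _CTDS.match(name)
    if match is None:
        raise ValueError("not a c#t#d#[s#] name: %r" % name)
    controller, target, lun, slice_no = match.groups()
    return {
        "controller": int(controller),   # host controller #
        "target": int(target),           # target ID of RAID controller
        "lun": int(lun),                 # LUN # created when setting up the array
        "slice": None if slice_no is None else int(slice_no),
    }
```

A name without a slice, such as c1t4d2, is returned with the slice set to None.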


luxadm commands for the A5000

luxadm probe -p          Display information about all attached A5000s. This will give you the enclosure names.

luxadm display           Use the display subcommand to display enclosure or device specific info.
                         enclosure info   ex: # luxadm display mars-0
                         device info      ex: # luxadm display mars-0,f3      (f3 = front disk slot #3)

luxadm inq               Use the inquiry subcommand to display inquiry info for the enclosure or a specific disk.
                         enclosure info   ex: # luxadm inq mars-0
                         device info      ex: # luxadm inq mars-0,f4          (f4 = front disk slot #4)

luxadm led_blink         Use the led_blink subcommand to start flashing the yellow LED associated with a specific disk.
                         ex: # luxadm led_blink mars-0,f2                     (f2 = front disk slot #2)

luxadm led_off           Use the led_off subcommand to turn off the yellow LED associated with a specific disk.
                         ex: # luxadm led_off mars-0,r3                       (r3 = rear disk slot #3)

luxadm power_off         Use the power_off subcommand to set an enclosure or disk to power save mode.
                         enclosure        ex: # luxadm power_off mars-0
                         disk             ex: # luxadm power_off mars-0,f5    (f5 = front disk slot #5)

luxadm power_on          Use the power_on subcommand to set a drive or enclosure to its normal power on state.
                         enclosure        ex: # luxadm power_on mars-0
                         disk             ex: # luxadm power_on mars-0,f1     (f1 = front disk slot #1)

luxadm remove_device     Use this subcommand to 'hot remove' a device or enclosure, when removing failed disk units
                         for replacement. Verbose output will walk you through the procedure.
                         enclosure        ex: # luxadm remove_device mars-0
                         disk only        ex: # luxadm remove_device mars-0,f6

luxadm insert_device     Use the insert_device subcommand for 'hot' insertion of a new disk or enclosure. Use after
                         the remove_device command to replace a failed drive with a new one. Verbose output will
                         walk you through the procedure.
                         ex: # luxadm insert_device mars-0,f5

luxadm reserve           Use the reserve subcommand to reserve the specified disk(s) for exclusive use by the host
                         from which the subcommand was issued.
                         ex: # luxadm reserve mars-0,f6

luxadm release           The release subcommand releases the drive from the reserve state.
                         ex: # luxadm release mars-0,f6

luxadm enclosure_name    Use the enclosure_name subcommand to change the enclosure name of one or more A5000s.
                         ex: # luxadm enclosure_name mars1 pluto2             (change from pluto2 to mars1)

luxadm download          Use the download subcommand to download a PROM image to the FEPROMs on an A5000 interface
                         board. Stop all activity on this connection before downloading firmware; the array will
                         recycle automatically after the download.
                         ex: # luxadm download -s mars-0
                             (will download firmware from the default file /usr/lib/locale/C/LC_MESSAGES/ibfirmware)
                         ex: # luxadm download -s -f /special/upgrade/ibfirmware.latest mars-0
                             (-f: you can specify the file name and do not use the default)
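
The enclosure and slot arguments used in the luxadm examples (for instance mars-0,f3) follow a simple pattern: enclosure name, then an optional comma with f (front) or r (rear) and the slot number. A small Python sketch of that pattern follows; the function name parse_luxadm_target is hypothetical and the code does not call luxadm itself.

```python
def parse_luxadm_target(spec):
    """Split 'mars-0,f3' into (enclosure, side, slot); a bare enclosure gives (name, None, None)."""
    enclosure, comma, slot = spec.partition(",")
    if not comma:
        return enclosure, None, None          # enclosure only, e.g. 'mars-0'
    sides = {"f": "front", "r": "rear"}
    if len(slot) < 2 or slot[0] not in sides or not slot[1:].isdigit():
        raise ValueError("expected f<slot> or r<slot>, got %r" % slot)
    return enclosure, sides[slot[0]], int(slot[1:])
```

With the examples from the listing, mars-0,f3 is the front disk in slot 3 of enclosure mars-0, and mars-0,r3 is the rear disk in slot 3.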


luxadm fcal_s_download   Use the fcal_s_download subcommand to download new fcode into ALL the FC100-HA SBus cards,
                         or to display the current versions of the fcode in each FC100-HA SBus card.
                         display:    ex: # luxadm fcal_s_download
                         download:   ex: # luxadm fcal_s_download -f /usr/lib/firmware/fc_s/fcal_s_fcode

Disk failure and replacement Veritas

Remove:
1. # vxdiskadm
2. Item 4 (Remove disk for replacement). Enter disk name. Remove another disk? n
3. Item 11 (Disable (offline) a disk device). Offline the same disk so it can be removed. q
4. # vxdctl enable      (this will reconfigure DMP)
5. # luxadm remove_device mars-0,f0      (mars-0,f0 is enclosure name, disk slot #) return
   (physically remove disk drive) (return)

Replacement:
6. # luxadm insert_device mars-0,f0      (mars-0,f0 is enclosure name, disk slot #) return
   (physically insert new disk) return
7. # vxdctl enable      (this will reconfigure DMP)
8. # vxdiskadm
9. Item 5 (Replace a failed or removed disk). Enter disk name, enter c#t#d#, continue y, replace another? n, quit q
10. From here you have a choice of 2 ways to complete this (most of the time this is up to the customer to do). Read both before choosing.

    1. Make the new disk the spare and the spare disk part of the RAID:
         # /usr/sbin/vxedit -g rootdg set spare=on disk01
         # /usr/sbin/vxedit -g rootdg set spare=off disk05
    OR
    2. Take the data from the rebuilt spare and put it back on the new drive. Evacuate the spare, disk05, back to disk01 to recover the original configuration:
         # /etc/vx/bin/vxevac disk05 disk01

Minimum Configuration A5000

These are minimum disk configurations to ensure adequate signal retransmission.

14 disk array The minimum configuration system has drives in slots 3, 6 in front and drives in 0, 3, and 6 in the rear. No other configuration is authorized. As disks are added they should be spaced to minimize gaps between disks.

22 disk array The minimum configuration system has drives in slots 0, 5 in front and drives in 0, 3, 6,and 10 in the rear. No other configuration is authorized. As disks are added they should be spaced to minimize gaps between disks.


A5000 Addressing

"sf"  = Host Adapter (socal); has 2 ports, sf@0,0 and sf@1,0
"ses" = Interface Boards (IB) in the A5000; 2 IBs/array, 2 ports/IB
          ses 0 and 1 = IB-A
          ses 2 and 3 = IB-B
"ssd" = disk drives

Device path breakdown:

sbus@1f,0/SUNW,socal@1,0/sf@1,0/ssd@w2100002037007fa1,0:a

sbus@1f                  Convert to decimal, divide by 2, round down; the result is the I/O bd #
socal@1                  SBus slot (d = on-board soc+)
sf@1                     Loop connection, port on the HBA: 0 = port A, 1 = port B
ssd@w2100002037007fa1    WWN #. Data path through IB to disk: 21 = node A, 22 = node B
,0                       LUN (always 0)
:a                       Slice (a = 0)
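
The path breakdown above can be sketched as a small Python decoder. This is an illustration under stated assumptions: the function name decode_a5000_path and the result keys are hypothetical, and reading the leading 21 or 22 of the WWN as node A or B is an interpretation of the diagram, not a documented API.

```python
def decode_a5000_path(path):
    """Decode a path like 'sbus@1f,0/SUNW,socal@1,0/sf@1,0/ssd@w2100002037007fa1,0:a'."""
    units = {}
    for component in path.strip("/").split("/"):
        name, _, unit = component.partition("@")
        units[name.split(",")[-1]] = unit           # 'SUNW,socal' -> 'socal'

    address, _, slice_letter = units["ssd"].partition(":")
    wwn_field, _, lun = address.partition(",")
    wwn = wwn_field[1:] if wwn_field.startswith("w") else wwn_field

    socal_slot = units["socal"].split(",")[0]
    return {
        # hex -> decimal, divide by 2, round down
        "io_board": int(units["sbus"].split(",")[0], 16) // 2,
        "sbus_slot": "on-board soc+" if socal_slot == "d" else socal_slot,
        # sf@0 = port A, sf@1 = port B on the HBA
        "hba_port": "A" if units["sf"].split(",")[0] == "0" else "B",
        "wwn": wwn,
        # assumption: first two hex digits of the WWN select node A (21) or B (22)
        "node": {"21": "A", "22": "B"}.get(wwn[:2]),
        "lun": int(lun),
        "slice": ord(slice_letter) - ord("a"),      # slice a = 0
    }
```

For the sample path, sbus@1f is hex 1f = 31 decimal, and 31 divided by 2 and rounded down is I/O board 15.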

A5000 Target ID assignments

(Box ID x 32) + (Backplane# x 16) + (Disk slot#) = Target ID

Box ID:        0, 1, 2, 3
Backplane #:   0 = front, 1 = rear
Disk slot #:   0-11, left to right

ex: a rear disk in slot 5 of an A5000 with a box ID of 3 would be (3 x 32) + (1 x 16) + 5 = t117
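
The target ID formula is easy to check with a few lines of Python. The function name a5000_target_id is hypothetical; the ranges are the ones given above.

```python
def a5000_target_id(box_id, rear, slot):
    """(Box ID x 32) + (Backplane# x 16) + disk slot#; backplane is 0 for front, 1 for rear."""
    if box_id not in range(4):
        raise ValueError("box ID must be 0-3")
    if slot not in range(12):
        raise ValueError("disk slot must be 0-11")
    backplane = 1 if rear else 0
    return box_id * 32 + backplane * 16 + slot
```

Calling a5000_target_id(3, True, 5) reproduces the worked example: 96 + 16 + 5 = 117, that is target t117.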

RDAC Module

- Used in the A3000 and A3500 arrays
- Dual hot plug RAID controllers
- Hot plug power and cooling units
- Battery backed up data cache
- SCSI out must be terminated (UDWIS)
- Controller status LED pattern will give you error information
- SCSI ID jumpers for both RAID controllers; default is 5 for the top controller and 4 for the lower one

RAID Overview

RAID Manager Device Naming Conventions

       Target ID of RAID controller
       |        slice
       |        |
   C#  T#  D#  S#
   |       |
   |       LUN # (created when setting up array)
   Host controller #


RAID LEVELS

RAID 0
RAID 0 is actually an AID (Array of Interconnected Disks); the R (redundant) part just isn't there. RAID 0 puts multiple physical disks together so that they appear as one large virtual disk. There are no parity drives or parity stripes.

RAID 1
RAID 1 is an array that is mirrored. That means there are 2 sets of disks, and every disk has a counterpart that is an exact copy. If one fails, the other will take its place.

RAID 3
RAID 3 has data striped across multiple volumes and a dedicated parity drive. If one of the drives should fail, its data can be reconstructed from the parity drive.

RAID 5
RAID 5 has data striped across multiple volumes like RAID 3, but also has its parity striped across multiple volumes. RAID 5 is also able to recover from a failed disk.
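
To show why a failed disk can be reconstructed from parity, here is a minimal Python sketch. It assumes the usual XOR parity scheme (the handbook does not name the parity function), and the function names parity_block and rebuild_block are hypothetical.

```python
from functools import reduce
from operator import xor

def parity_block(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("need at least one block, all of the same length")
    return bytes(reduce(xor, column) for column in zip(*blocks))

def rebuild_block(surviving_blocks, parity):
    """Recover the one missing data block from the survivors plus the parity block."""
    return parity_block(list(surviving_blocks) + [parity])
```

Because XOR is its own inverse, XORing the parity with every surviving data block leaves exactly the contents of the missing block. The same arithmetic applies whether parity sits on a dedicated drive (RAID 3) or is striped across the drives (RAID 5).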

Boot process

1. VTOC (volume table of contents)   Sector 0 of boot disk

2. Boot Block      Sectors 1-15. UFS reader; can be rebuilt with the installboot command.

3. ufsboot         /platform/`uname -m`/ufsboot
                   Loads the standalone kernel. You can tell it is loaded by the first instance of the spinning wheel (after the memory size POST spinning wheel).

4. genunix         /kernel/genunix; generic unix kernel for the operating system; specific only to the O/S release

5. unix            /platform/`uname -m`/kernel/unix
                   Specific to O/S and architecture type. (You can tell it is loaded by the second instance of the spinning wheel, at the SunOS Release 5.7 message.)

6. /etc/system     Has the variables to custom load kernel parameters. boot -a will not use the /etc/system file on boot.

7. /etc/inittab    sysinit:      as we are trying to grab the console
                   respawn:      respawn proc if it dies
                   initdefault:  default run level
                   wait:         wait for job to complete
                   powerfail:    on PWR signal run appropriate command


Diagnostic commands:

arp                          Displays Address Resolution Protocol tables.
catman -w                    Create the /usr/share/man/windex database for use with the index function available
                             through the apropos command. Creates a windex file that includes every Solaris command
                             and a brief description.
compare                      Will tell you the difference between two files. ex: compare /kernel /usr/kernel
crash                        Used to analyse crash dumps
devlinks                     Creates symbolic links in /dev using info in /devices
df -k                        Displays disk space usage in Kbytes, including free space
dfmounts                     Display remote filesystem mount info.
dfshares                     Displays shared filesystem info.
diff                         Compare file contents
disks                        Creates symbolic links in /dev/dsk and /dev/rdsk; used after the drvconfig command
drvconfig                    Configure the /devices directory and the device information tree.
eeprom                       Analyse and change PROM settings.
file                         Determine a file's type
find                         Search for specific files
format                       Analyse or modify partition information
fsck                         Check UFS filesystems for inconsistencies
fstyp -v                     Display extensive file system parameters for a specified file system.
grep                         Analyse file contents, and search for specific patterns.
groups                       Display group definitions for a given user
ifconfig -a                  Add, display, and analyse the status of network interfaces
iostat                       Analyse I/O performance issues
isainfo -v                   Will tell you if you are running 32 or 64 bit applications
last                         Display history of system login information
ls                           Analyse file properties
mpstat                       Reports processor stats on a per processor basis
ndd                          Get and set named device driver parameters
netstat (-i, -r, -k)         Analyse network tuning information, including active routes.
                             -i interface info/collisions, -r router info,
                             -k kernel info (pipe to more and look for the interface; verbose version of -i)
newfs                        Create and examine file system parameters
nfsstat                      Analyse NFS performance information
od                           Octal dump of a file. ex: od -c /etc/nsswitch.conf will display all characters in the file
pagesize                     Print the size of a memory page in bytes
patchdiag (sunsolve CD)      Listing of recommended patches
patchadd -p                  Displays patches loaded on your system
patchinstall (sunsolve CD)   Is used to install patches
                             (ex: # cd /cdrom/cdrom0)
                             (    # ./patchinstall)
backoutpatch (sunsolve CD)   Will remove a patch after you cd to that directory
                             (ex: # cd /var/sadm/patch/102044-01)
                             (    # ./backoutpatch .)
perfmeter                    Provide graphic display of performance metrics
ping (-s)                    Contact network hosts by sending Internet Control Message Protocol (ICMP) request and
                             reply datagrams.
pkgchk                       Check file integrity and accuracy of installation
pkginfo -l                   Will give you a description of all the packages (w/o pkg name) or one package (w pkg name)
prtdiag                      Display system configuration and diagnostic information (/usr/platform/`uname -m`/sbin)
prtconf -v                   Get system device information from POST probe
prtconf -vp                  Device tree info and PROM version (OBP)


Diagnostic commands continued:

prtvtoc                      List the VTOC (disk label) of a disk drive. ex: prtvtoc /dev/rdsk/c0t0d0s0
psrinfo -v                   Will give you processor information
psradm -f (-n)               -f will allow you to offline a processor. -n will online a specified processor.
/usr/ucb/ps -aux             Lists processes in CPU utilization descending order.
pwck                         Checks the password file for inconsistencies
sar                          Analyse system performance information (must be initialized in /etc/init.d/perf)
showrev -p                   List currently installed patches; patchadd -p in Solaris 2.6 and above
snoop (-s)                   Display and analyse network traffic
strings                      Search object and binary files for ASCII strings
sysdef                       Analyse device and software configuration information.
swap                         Add, delete and monitor system swap areas
sum                          Calculate and print a checksum value for a named file
sys-unconfig                 Enables you to change information entered during the sysidtool phase of installation
tail -f                      Leave file open for reading and display what is there
tic                          Terminfo compiler; translates a terminfo file from source to compiled format
timex                        List runtime and system activity information during command execution
traceroute                   Show the route followed by packets transferred in a subnet environment
truss                        Trace system calls issued and used by a program or command
tunefs                       Modify file system parameters that affect layout policies
uname                        Print platform, architecture, operating system, and system node information.
vmstat                       Analyse memory performance statistics
who am i                     Display the effective current user name, terminal line and login time
xhost hostname               Allows graphical access to your host from the host specified in hostname

Diagnostic files

/etc/defaultdomain       Name of the current domain, read and set at each boot by script /etc/init.d/inetinit
/etc/default/cron        Determine logging activity for the cron daemon through specification of the cronlog variable
/etc/default/login       Control root logins at the console through specification of the console variable and other
                         defaults.
/etc/default/su          Determine logging activity for the su command through specification of the sulog variable
/etc/dfs/dfstab          List what distributed file systems will be shared at boot time
/etc/dfs/sharetab        List currently shared NFS file systems
/etc/hosts               Host file linked to /etc/inet/hosts
/etc/hostname.le0        Assign a system name, and through cross-referencing the /etc/hosts file, add an IP address
/etc/hostname.hme0       to a particular network interface
/etc/inetd.conf          List information for network services that can be invoked by the inetd daemon
/etc/inittab             Read by init daemon at startup to determine which rc script to execute; also contains
                         default run level.
/etc/minor_perm          Specifies permissions to be assigned to device files
/etc/mnttab              Display a list of currently mounted file systems
/etc/name_to_major       Display a list of configured major device numbers.
/etc/netconfig           Display the network configuration database read during network initialization and use
/etc/nsswitch.conf       List the database configuration file for the name service switch engine.
/etc/path_to_inst        List the contents of the system device tree using the format of physical device names
                         and instance numbers
/etc/protocols           List known protocols used in conjunction with internet
/etc/release             O/S release and date
/etc/rmtab               List the current remotely mounted file systems


diagnostic files continued:

/etc/rpc                              List available RPC programs
/etc/services                         List the well-known networking services and associated port numbers; maintained by NIC
/etc/system                           Tunable kernel parameters. boot -a will boot w/o an /etc/system file
/etc/vfstab                           List local and remote filesystems mounted at boot time.
/var/adm/messages                     Lists recent console window and boot messages
/var/adm/sulog                        Display a record for each invocation of the su command
/var/adm/utmpx                        List user and accounting information for the who and login commands
/var/adm/wtmpx                        Maintain history of user information for the accounting package and report facility.
/var/crash/hostname                   Crash files: unix is the symbol lookup file, vmcore is the core dump, bounds is the
                                      incremental value for the next core set.
/var/lp/log                           List print services activity
/var/sadm/install/contents            List installed software packages
/var/sadm/install_data/install_log    A listing of the way the install was completed
/var/sadm/pkg                         Patch and package information (new O/Ss)
/var/sadm/patch                       Patch and package information (old O/Ss)
/var/sadm/system/admin/INST_RELEASE   List of clusters installed on the system.
/var/saf/_log                         List activity of the Service Access Facility (SAF)
/var/spool/locks/lck                  Clean up to clear a bad tip session (will get error: all ports busy)

Watchdog Resets

CPU Watchdog Reset is initiated on a single processor machine when a trap condition occurs while traps are disabled and the register bit to enable traps is not set. The system tries to come down in a deterministic state and traps to a reserved physical address.

System Watchdog Reset is when a fatal error is detected on a multi-processor machine.

The obpsym module should be loaded to maximize the amount of symbolic information available in the PROM (OBP) environment. Without this module, information is displayed without textual information.

To check if obpsym is loaded:
    # modinfo | grep obpsym

To load the module from the command line:
    # modload -p misc/obpsym

To load the module with each boot, enter the following in /etc/system:
    forceload: misc/obpsym

obp register commands - sun 4u (used with watchdog reset analysis)

.locals           Displays the local CPU registers

.registers        Dumps the registers of the current window, those in use at the time of the crash.

ctrace            Displays a stack trace, listing routines that were active when the system went down
                  (obpsym module should be loaded, see above)

.pstate           Formatted display of the process state register

.ver              Formatted display of the version register

.ccr              Formatted display of the ccr or cache control register

.trap-registers   Display of trap related registers


obp register commands - sun4m (used with watchdog reset analysis)

.locals           Displays the local CPU registers

.registers        Dumps the registers of the current window, those in use at the time of the crash.

ctrace            Displays a stack trace, listing routines that were active when the system went down
                  (obpsym module should be loaded, see above)

.psr              Formatted display of the process status register

.fregisters       Display of the floating point registers

What to look for at the OK prompt of a watchdog reset:

Note the number next to the OK prompt, which is the number of the CPU that hit the watchdog reset (multi-processor only).

Note the information in the following fields from the OK prompt:

.registers        Valid addresses associated with the window registers on display
.locals           Valid addresses associated with the registers on this display
ctrace            pc addresses and routine names
.ver              The implementation (IMPL) and manufacturer (MANUF) numbers
.trap-registers   The trap type (TT), the (TSTATE), and the processor state (PSTATE)
.pstate           The RED value, which is similar to the ET (enable trap) bit on SPARC Version 8

Solaris commands and files that can be used in watchdog reset analysis:

showrev -p
prtconf -v
pkginfo
/usr/ccs/bin/nm /dev/ksyms > symbol_file
/usr/platform/sun4u/sbin/prtdiag -v > prtdiag_file
/etc/system
/var/adm/messages

Related document numbers in the SunSolve database include:

1360  - Trouble Shooting Watchdog Resets
14133 - Is the system crash due to hardware or software
14230 - System crashes and how to prepare for analysis by Sun Service


Dump analysis

**** Cores sent in from the customer are located in:
     /net/eastcores/corefiles/SO#      (SO# = SO opened by customer)
     /net/cores.central/cores/gesd/fidelity/open/SO#

*** (STOP A) sync ... on a hung system will cause a core dump.

Three debuggers:

adb: Assembly debugger. It is an interactive, general purpose utility that can be used to examine files, and it provides a controlled environment for executing programs. By default it does not supply a prompt.

(to run adb on a dump file)
    # cd /var/crash/host_name
    # adb -k unix.n vmcore.n

(to run on a live system)
    # adb -kw /dev/ksyms /dev/mem

What to look for in a core dump with the adb debugger:

$<msgbuf    This will give you the:
              - name of the failing process
              - register pointer (rp=)
              - PID (pid=)
              - program counter (pc=)
              - stack pointer (sp=)
              - thread of failing process (g7)

If there is no info, do this:

To find the executing instruction:

1. Do a stack trace:
     $c
   (this will give you a listing to use in step 2)

2. Get the register pointer:
     64 bit system: 2nd value from 'die'    ex: die (0x9, 0xf05246f4, 0x30, 0x326,...
     32 bit system: 2nd value from 'trap'   ex: trap (0xf028a1d8, 0xf05246f4, ...
   (use this value in step 3)

3. Get the value in register 'pc':
     0xf05246f4$<regs
   (use the value under the pc heading for step 4)
     ex: pc fc479dbc

4. Use the value in 'pc' to see the executing instruction:
     fc479dbc/ai
   (it will tell you something like 'ram_write')


To find thread involved with panic:

1. panic_thread/x (32 bit systems)
   panic_thread/k (64 bit systems)
   (this will print out something like... panic_thread: f5c66480)

2. Use the thread value to find the 'procp':
     f5c66480$<thread
   (look at the structure and retrieve the 'procp' value)
     ex: procp f5c0fcc8

3. Take a look at the process structure to get the process name and arguments (psargs):
     f5c0fcc8$<proc2u
   (you should see something in text for the process name)

4. You can also use the 'procp' value found in step 2 to get the 'pidp' address:
     f5c0fcc8$<proc
     pidp f74ccf93

5. Use the 'pidp' address from step 4 with the 'pid.print' macro:
     f74ccf93$<pid.print

adb commands

cpu$<cpus Display cpu0 which contains the address of the currently running thread.

cpun $< cpu Display the cpu identified by n

$<msgbuf Display the msgbuf structure, which contains the console messages leading up to the panic.

$c Display the stack trace

$C Show the call trace, and stack trace leading up to a panic from the bottom up.

$r Display the SPARC window registers, including the program counter and the stack pointer

<sp$<stacktrace Use the sp(stack pointer) address to locate and display a detailed stacktrace

$q Quit adb

$>file Redirect output to file


crash similar to adb, but the command interface is different. Crash is used to examine memory of a running or crashed system.

(to run crash on a dump file) # cd /var/crash/host_name # crash vmcore.n unix.n

(to run on a live system) # crash (without any arguments)

crash commands:

u or user will give info on the process that was running when the crash occurred

stat will give you the following information:
    - system name
    - version information
    - time of crash
    - age of system
    - type of panic

proc will give you listing of process table

defproc will give you the current process slot number (used with proc command)

defthread will give you the current thread address

kadb is similar to adb. It must be loaded prior to the standalone program it is to debug. To run the kernel under kadb type 'boot kadb' at the ok prompt

iscda is Initial System Crash Dump Analysis... The script is included on the sunsolve CD under the top level directory ISCDA. The following is an example of usage:

# cd /var/crash/machine_name
# iscda unix.0 vmcore.0 > /tmp/iscda.output

This will run the iscda script on the core dump in /var/crash/machine_name. The output will go to /tmp/iscda.output and consists of the results of a sequence of adb and crash commands. If needed, you can send this file to the Sun solution center via email.

SunSolve

The Sunsolve CD is a valuable tool in diagnosing problems. The following are home page selections:

Power search provides a menu driven database selection for searching, Bug reports, FAQ's, Patch descriptions, tech bulletins, Info docs, Symptom and Resolutions

Patch Diag Tool Determines the patch level of your system compared to Sun's recommended patch lists. Can be run from the command line: # patchdiag


Crash Dump Analysis Displays how to load and run the ISCDA script. (Initial System Crash Dump Analysis)

Sun Courier Submits a service request to Sun solution center. (sendmail must be running)

Installing a patch with the Sunsolve CD:

# cd /cdrom/cdrom0
# ./patchinstall (patch#)

Removing a patch with the Sunsolve CD:

# showrev -p     (list all patches installed on your system, get name and rev)
# find / -name 102044-01 -print     (find installed location of patch)
# cd /var/sadm/patch/102044-01     (change to patch directory)
# ./backoutpatch .
# reboot

SUN VTS

Sun VTS is a validation test suite. VTS is run at the Solaris level, but should not be run while the customer's applications are up. VTS comes with the Solaris package, and there are different revisions for different Solaris releases: rev 2.12 for Solaris 2.6, 3.0 for Solaris 7 and 3.4 for Solaris 8. It is recommended to use the version of VTS that corresponds to the O/S you are running. Also check Sunsolve for related patches.

Installation: (loads to the /opt/SUNWvts directory) (share -F nfs -o ro,anon=0 /cdrom/cdrom0/s0 if ssp)

# cd /cdrom/cdrom0/Product
# pkgadd -d . SUNWvts SUNWvtsx SUNWodu SUNWvtsmn
or
# /cdrom/cdrom0/installer     (or run thru file manager window)

To run: (programs reside in the /opt/SUNWvts/bin directory)

# sunvts                  Default graphical interface (CDE) on local machine
# sunvts -l               Runs Openlook graphical interface on local machine
# sunvts -t               Runs in tty mode *
# sunvts -h host-name     Runs graphical interface on local machine while connecting and
                          testing a remote machine (Sun VTS must be loaded on both machines)

*   # TERM=vt100; export TERM     (use this command when running in tty mode from notebook)
*** set_options / Thresholds to 00     (to log errors and continue)

sunvts -t Navigation: (the <ctl> keys are good if you forgot to set the TERM)

<tab>       move between windows
<ctl> w     move between windows
<arrow>     move within window
<ctl> r     move within window on same line
<ctl> u     move within window up/down lines
<ctl> f     move within window forward
<ctl> b     move within window backwards
<ctl> l     refresh screen
<esc>       close pop-up menu
<space>     select / deselect test
<enter>     select function


STORtools

STORtools Toolkit simplifies the monitoring and troubleshooting of Sun StorEdge A5000, A5100, A5200 disk array installations. The tool provides an easy to use menu driven front end program with task explanations and help information. Command line utilities are provided for advanced customized use. The utilities have standard man pages for online documentation.

STORtools provides tools for performing the following tasks:

- Revision Checking
- Configuration Management
- Monitoring and Notification
- Troubleshooting and Fault Isolation

To install from CDrom:
# pkgadd -d . STORtools

To install after download from web site:
# uncompress STORtools.tar.Z
# tar -xvf STORtools.tar
# pkgadd -d . STORtools

To run STORtools:
# /opt/STORtools/bin/stormenu


Explorer Scripts:

New Version:

The new version of explorer can be found on Sunsolve under "navigation - diagnostic tools". It is now a software package (SUNWexplo) and can be installed and run (initially) with the pkgadd -d command.

To expand:  # zcat SUNWexplo.tar.z | tar xvf -
To install: # pkgadd -d . SUNWexplo

Once the package is installed explorer can be run from /opt/SUNWexplo/bin/explorer.

Old Version:

The following is documentation sent out with the explorer script. It contains information on how to expand, run and mail the output from the explorer.

1. # su root
2. Save the explorer.tar.Z file in a directory where root has write permission
3. For encoded files:
   # uudecode filename
   # zcat explorer.tar.Z | tar xvf -
4. # ./explorer

- While executing this script, you will be prompted to enter information about your site.
- If you have internet access, we ask that you enter "y" to the question "Would you like to e-mail results [y/n]" so that we get the output automatically.
- If you choose not to e-mail the explorer file automatically, please send the resulting file (*.uu) as an attachment to your PTAS account manager.

Explorer in CRON (for this example, explorer will reside in /usr/tmp)

**** Do steps 1-3 above
1. Copy the file 'explorer.template' to another file (ie: file_name)
2. # chmod 755 file_name
3. Edit file_name and fill in the appropriate lines.
4. Edit the root crontab file using the 'crontab -e' command and make an entry similar to the following:

00 23 1 * * cd /usr/tmp; /usr/bin/zcat explorer.tar.Z | /usr/bin/tar xvf - ; /usr/tmp/explorer -file /usr/tmp/file_name -mail

5. If you choose not to email the explorer file automatically (-mail option) please send the resulting file (*.uu) as an attachment to your PTAS Account manager.

Note: if crontab -e does not work correctly, try setting the following variable: 'setenv EDITOR vi'
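The crontab entry above packs its schedule into five leading fields. As a reading aid, the following is a minimal sketch (the function name cron_fields is made up for illustration and is not part of explorer) that labels those fields so the "00 23 1 * *" schedule can be read as 23:00 on the 1st day of every month:

```shell
# Hypothetical helper: label the five schedule fields of a crontab entry.
# $1 = a full crontab line (schedule fields followed by the command)
cron_fields() {
    # A here-document is used so the '*' fields are never glob-expanded.
    read -r min hour dom mon dow cmd <<EOF
$1
EOF
    echo "minute=$min hour=$hour day-of-month=$dom month=$mon day-of-week=$dow"
}

# The schedule from the explorer example (command shortened).
cron_fields "00 23 1 * * /usr/tmp/explorer -file /usr/tmp/file_name -mail"
```

Changing the third field from 1 to * would run explorer every night at 23:00 instead of once a month.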

To view the explorer output file:

run uudecode on the *.uu file     (this will create a host_id.tar.z file)
run gunzip on the tar.z file      (this will create a host_id.tar file)
run tar -xvf on the .tar file     (this will expand the file to the explorer output structure)


Performance Analysis

Tools: (commands)

timex     reports system activity for the execution of a single command
          -o reports I/O transfers
          -s reports sar activity during command
          -h reports 'hog factor'
          ex: # timex ps -ef     (will tell you the amount of time the ps command took to execute)

top       display and update information about the top cpu processes
          ex: # top 20     (will give you stats on the top 20 processes; default is 10)

vmstat    reports virtual memory statistics
          ex: # vmstat 15 2     (will collect and report virtual memory stats for 2 intervals of 15 seconds)

iostat    reports I/O statistics
          ex: # iostat 60 3     (will collect and report I/O statistics for 3 intervals of 60 seconds)

disk thruput test: ( from infodoc 21931)

for write performance: (this will write over data. do not use if data is needed on this disk)
# dd if=/dev/zero of=/dev/rdsk/cxtxdxs2 bs=1024k

for read performance:
# dd if=/dev/rdsk/cxtxdxs2 of=/dev/null bs=1024k
# iostat -pxn 5

mpstat    reports processor statistics per processor
          ex: # mpstat 30 2     (will collect and report proc stats for 2 intervals of 30 seconds)

sar       reports overall system activity
          -u CPU usage data
          -q average length of run queue
          -r collect paging data
          ex: # sar -u 60 30     (will collect cpu data for 30 intervals of 60 seconds each)
              # sar -q 60 30     (will collect run queue data for 30 intervals of 60 seconds each)
              # sar -r 60 30     (will collect paging data for 30 intervals of 60 seconds each)

w reports on current system activity per user


Backups

ufsdump   backs up all files specified by files_to_dump (normally either a whole file system or files within a file system changed after a certain date) to magnetic tape, diskette, or disk file. File systems to be backed up must be inactive (unmounted or in single user mode).

0-9   Dump level. 0 is a full dump. Any other level backs up the files modified since the most recent dump at a lower level. For example, if a level 2 dump was done and a level 4 dump was done the next day, a level 5 dump the day after would back up all files modified since the level 4. If instead you did a level 3 dump, all files modified since the level 2 would be backed up.

c     Cartridge. Sets the defaults for cartridge instead of the standard half-inch reel.

f     Dump file. Use dump_file as the file to dump to, instead of /dev/rmt/0. If dump_file is specified as -, dump to standard output.

u     Update the dump record. Add an entry to the file /etc/dumpdates.

v     Verify. After each tape or diskette is written, verify the contents of the media against the source file system.

ex: # ufsdump 0cfu /dev/rmt/0 /dev/rdsk/c0t3d0s0     (full dump of a root file system on c0t3d0 on cartridge tape unit 0)
    # ufsdump 0uf /dev/rmt/0 /usr     (dump the /usr filesystem to tape)
    # ufsdump 5fuv /dev/rmt/1 /dev/rdsk/c0t3d0s6     (make and verify an incremental dump at level 5 of the /usr partition of c0t3d0, on tape unit 1)
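The dump level rule is easy to misread, so here is a small sketch of it. The function name dump_reference is invented for this illustration, and it only models the rule described above (a dump is relative to the most recent earlier dump at a lower level); it does not read /etc/dumpdates or call ufsdump.

```shell
# Hypothetical helper: print the level of the earlier dump that a new dump
# would be taken relative to.
# $1 = new dump level; remaining args = levels of earlier dumps, oldest first.
# Prints "none" when no earlier dump has a lower level (everything is dumped).
dump_reference() {
    new=$1
    shift
    ref=none
    for lvl in "$@"; do
        # Keep the latest earlier dump whose level is lower than the new one.
        if [ "$lvl" -lt "$new" ]; then
            ref=$lvl
        fi
    done
    echo "$ref"
}

# The scenario from the text: level 2, then level 4 the next day.
dump_reference 5 2 4    # a level 5 is relative to the level 4
dump_reference 3 2 4    # a level 3 is relative to the level 2
```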

ufsrestore ufsrestore utility restores files from backup media created with the ufsdump command.

i     Interactive. After reading in the directory information from the media, ufsrestore invokes an interactive interface that allows you to browse through the dump file's directory hierarchy and select individual files to be extracted. Valid commands are ls, cd, add, verbose, delete, extract, quit.

r     Recursive. Restore the entire contents of the media into the current directory (which should be the top level of the file system). To completely restore a file system, use this function letter to restore the level 0 dump, and again for each incremental dump.

t     Table of contents. List each filename that appears on the media. If no filename argument is given, the root directory is listed.

x     Extract the named files from the media. If a named file matches a directory whose contents were written onto the media, and the h modifier is not in effect, the directory is recursively extracted.

f     Use dump_file instead of /dev/rmt/0 as the file to restore from. Typically dump_file specifies a tape or diskette drive.

ex: # ufsrestore tvf /dev/rmt/0     (list tape contents of /dev/rmt/0)
    # ufsrestore rvf /dev/rmt/0     (restore contents of tape /dev/rmt/0 to the current directory you are in)
    # ufsrestore ivf /dev/rmt/0     (interactive restore of tape /dev/rmt/0)


tar       copies and archives files
          -c create (backup)
          -v verbose (details)
          -f device
          -t table of contents (list)
          -x extract
          -p restore to original mode
          -h follow symbolic link
          -d access special files

          ex: # tar -cvf /dev/rmt/0 /usr     (backup /usr to tape /dev/rmt/0)
              # tar -xvf /dev/rmt/0 /usr     (restores /usr from tape /dev/rmt/0)
              # tar -tvf /dev/rmt/0     (lists the contents of tape /dev/rmt/0)
              # zcat file_name.tar.Z | tar xvf -     (expand a tar.Z file)

cpio      copies and archives files
          -o output
          -v verbose
          -i input
          -t list
          -d create directories
          -m retain modification time

          ex: # find /usr -print | cpio -ov > /dev/rmt/0     (copies /usr to /dev/rmt/0; cpio -o reads the list of files from standard input)
              # cpio -itv < /dev/rmt/0     (list the contents of /dev/rmt/0)
              # cpio -idmv < /dev/rmt/0     (restores /dev/rmt/0)

dd        device to device copy
          ex: # dd if=ascii_file of=ebcdic_file conv=ebcdic     (converts an ASCII file to EBCDIC)
              # dd if=/dev/rmt/0 of=/dev/rmt/1     (copies from rmt/0 to rmt/1)
              # dd if=/dev/rdsk/c0t0d0s2 of=/dev/rdsk/c1t0d0s2 bs=512000     (for a quick copy of c0t0d0 onto c1t0d0)


How is a Coredump Generated?

When a system crashes, it writes a copy of its memory to a temporary location on a disk, usually to the primary swap partition. Savecore is a program which runs at boot time to retrieve the memory copy from the temporary location and to save it to a place where it can be accessed. Savecore must be run during the bootup process, or very shortly thereafter, before it would be overwritten by a running operating system which uses the primary swap partition for other purposes.

How to Get a Coredump from a Solaris 2.x system

Getting a coredump is not enabled by default, because corefiles can be quite large. Enabling a coredump requires the following to be done:

1) Verify that savecore exists. Do the following command:

      ls -l /usr/bin/savecore

   savecore is located in the SUNWtoo package (Programming tools) in 2.X, and is not part of the core install.

   If savecore does not exist on a 2.X system, do a pkgadd on SUNWtoo.

   a) Put the correct OS version installation CDrom in the CDrom drive.

   b) Wait until the access lamp goes out in the CDrom drive.

   c) # pkgadd -d /cdrom/sol*/s0/Sol* SUNWtoo

   d) Answer the questions.

2) Determine how much memory you have on your system. This can be done by:

a) examining your system banner if your system is down by typing "banner" at the "OK" prompt.

b) doing a "wsinfo" on a 2.x system running openwindows, and checking the "physical memory" column.

c) looking at the /var/adm/messages file, or output of the dmesg command, and searching for the line which starts with "mem =". The number which follows will be in bytes. Divide by 1048576 to get megabytes.

3) Find any locally mounted partition, other than /tmp, which has enough room to hold the coredump. A coredump takes usually about 35% of the size of main RAM memory.

4) Verify that your dump area is at least 35% of the size of main RAM memory. A regular disk is preferred to a meta-filesystem running under Veritas or DiskSuite control. The dump area is usually the primary swap file.

Execute a "swap -l" command and observe the first line with values in it. Take the number in the "blocks" column and divide by 2048. This is the number of megabytes in the primary swap file. Compare this to the size of main RAM memory found in step (2) above.


5) Enable savecore as follows: (Savecore is enabled by default in Solaris 2.7.)

   a) Edit /etc/init.d/sysetup, and search for the word "savecore". You will find something similar to

      ##
      ## Default is to not do a savecore
      ##
      #if [ ! -d /var/crash/`uname -n` ]
      #then mkdir -p /var/crash/`uname -n`
      #fi
      #echo 'checking for crash dump...\c '
      #savecore /var/crash/`uname -n`
      #echo ''

   b) Remove the left "#" signs from the bottom 6 statements in (a) above.

   c) (optional, if you don't want the core copied to /var or if /var isn't large enough) Substitute the name of the partition found in (3) above for "/var" wherever it shows in the statements in (a) above.

Incidentally, if you know that savecore is enabled but do not know where the corefiles are put, checking the "savecore" statement listed above will tell you.
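The sizing arithmetic in steps 2 to 4 can be sketched as a few shell functions. The function names and the sample figures below are illustrative assumptions, not output from a real system; the constants (1048576 bytes per megabyte, 2048 blocks per megabyte, and the 35% rule of thumb) are the ones given in the steps above.

```shell
# Convert a byte count (the "mem =" value) to whole megabytes.
bytes_to_mb() {
    echo $(( $1 / 1048576 ))
}

# Convert 512-byte blocks (the "blocks" column of swap -l) to whole megabytes.
blocks_to_mb() {
    echo $(( $1 / 2048 ))
}

# Succeed if a dump area of $1 MB can hold roughly 35% of $2 MB of RAM.
dump_area_ok() {
    needed=$(( $2 * 35 / 100 ))
    [ "$1" -ge "$needed" ]
}

# Made-up example figures: 268435456 bytes of RAM, 611040 blocks of swap.
ram_mb=$(bytes_to_mb 268435456)
swap_mb=$(blocks_to_mb 611040)
if dump_area_ok "$swap_mb" "$ram_mb"; then
    echo "swap (${swap_mb}MB) is large enough for ${ram_mb}MB of RAM"
else
    echo "swap (${swap_mb}MB) is too small for ${ram_mb}MB of RAM"
fi
```

The same dump_area_ok check can be pointed at the free space of the partition chosen in step 3, since savecore needs about the same amount of room there.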


Dump device bad when saving core on encapsulated root

Problem: Systems with VxVM encapsulated boot disks will not be able to do system dumps if the swap slice is not tagged as swap. With the root drive encapsulated, if the system tries to do a system dump in the event of a panic, it may present messages similar to the following:

panic: <some OS kernel panic message>
syncing file systems... done
 2084 static and sysmap kernel pages
  380 dynamic kernel data pages
  385 kernel-pageable pages
    0 segkmap kernel pages
    0 segvn kernel pages
  253 current user process pages
 3102 total pages (3102 chunks)
dumping to vp fc2f9204, offset 171232
0 total pages, dump device bad     <=- The problem!
rebooting...

Problem Solution:

If the swap slice was not tagged as swap in format when the root drive was encapsulated, the encapsulation process will zero out the swap slice when it makes the swap volume:

Part      Tag    Flag     Cylinders       Size           Blocks
  0       root    wm       0 -  134     100.20MB    (135/0/0)    205200
  1 unassigned    wm       0              0         (0/0/0)           0
  2     backup    wm       0 - 2732       1.98GB    (2733/0/0)  4154160
  3        usr    wu     825 - 1229     300.59MB    (405/0/0)    615600
  4        usr    wm    1230 - 1667     325.08MB    (438/0/0)    665760
  5 unassigned    wm       0              0         (0/0/0)           0
  6          -    wu       0 - 2732       1.98GB    (2733/0/0)  4154160
  7          -    wu     135 -  135       0.74MB    (1/0/0)        1520

In this example, slice 1 is the swap slice.

When the system dumps, it needs to use the physical device and not the swap volume. The dump fails because slice 1 shows a zero size in format.

To solve the dump device problem, you need to go into format and edit slice 1, change the tag to swap, and give it the start and end cylinders.


To get the end cylinder, you need to look in /etc/vx/reconfig.d/disk.d/c?t?d?/vtoc:

# cd /etc/vx/reconfig.d/disk.d/c0t0d0
# more vtoc

#THE PARTITIONING OF /dev/rdsk/c0t0d0s2 IS AS FOLLOWS :

#SLICE  TAG    FLAGS   START      SIZE
 0      0x2    0x200   0          103360
 1      0x0    0x201   103360     611040
 2      0x5    0x200   0          4154160
 3      0x4    0x200   718960     615600
 4      0x7    0x200   1334560    205200
 5      0x0    0x200   1539760    410400
 6      0x0    0x000   0          0
 7      0x0    0x000   0          0

In this example, 611040 is the size of slice 1 in blocks. Entering it as 611040b for the partition size in format sets the ending cylinder for slice 1.

In format, select the root drive and edit slice 1:

partition> p
Current partition table (unnamed):
Total disk cylinders available: 2733 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders       Size           Blocks
  0       root    wm       0 -   67      50.47MB    (68/0/0)     103360
  1 unassigned    wm       0              0         (0/0/0)           0
  2     backup    wm       0 - 2732       1.98GB    (2733/0/0)  4154160
  3        usr    wm     473 -  877     300.59MB    (405/0/0)    615600
  4        var    wm     878 - 1012     100.20MB    (135/0/0)    205200
  5 unassigned    wm       0              0         (0/0/0)           0
  6          -    wu       0 - 2732       1.98GB    (2733/0/0)  4154160
  7          -    wu    2732 - 2732       0.74MB    (1/0/0)        1520

partition> 1

Part      Tag    Flag     Cylinders       Size           Blocks
  1 unassigned    wm       0              0         (0/0/0)           0

Enter partition id tag[unassigned]: swap
Enter partition permission flags[wm]:
Enter new starting cyl[0]: 68
Enter partition size[0b, 0c, 0.00mb]: 611040b     <== from vtoc file
partition> l
Ready to label disk, continue? y

partition> p

Current partition table (unnamed):
Total disk cylinders available: 2733 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders       Size           Blocks
  0       root    wm       0 -   67      50.47MB    (68/0/0)     103360
  1       swap    wm      68 -  469     298.36MB    (402/0/0)    611040
  2     backup    wm       0 - 2732       1.98GB    (2733/0/0)  4154160
  3        usr    wm     473 -  877     300.59MB    (405/0/0)    615600
  4        var    wm     878 - 1012     100.20MB    (135/0/0)    205200
  5 unassigned    wm       0              0         (0/0/0)           0
  6          -    wu       0 - 2732       1.98GB    (2733/0/0)  4154160
  7          -    wu    2732 - 2732       0.74MB    (1/0/0)        1520

partition> q
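The starting cylinder (68) and the resulting 68 - 469 range follow from the vtoc START and SIZE values once the number of blocks per cylinder is known. The sketch below shows that arithmetic; the function names are made up for illustration, and the blocks-per-cylinder figure is derived from slice 0 of this example (103360 blocks in 68 cylinders), so it applies to this disk geometry only.

```shell
# Blocks per cylinder, taken from any slice whose size is known both ways.
# $1 = size in blocks, $2 = size in cylinders
blocks_per_cyl() {
    echo $(( $1 / $2 ))
}

# Print "start_cyl end_cyl cyl_count" for a slice.
# $1 = START block from vtoc, $2 = SIZE in blocks, $3 = blocks per cylinder
slice_cylinders() {
    start=$(( $1 / $3 ))
    count=$(( $2 / $3 ))
    echo "$start $(( start + count - 1 )) $count"
}

bpc=$(blocks_per_cyl 103360 68)        # slice 0: 103360 blocks = 68 cylinders
slice_cylinders 103360 611040 "$bpc"   # prints: 68 469 402
```

This matches the relabeled table, where slice 1 covers cylinders 68 - 469 (402 cylinders, 611040 blocks).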


Uncompressing Files:

What to use to uncompress files:

Use the 'file (file_name)' command to determine what type of compression was used.
Ex: # file 2.6_x86_Recommended.tar.gz
    2.6_x86_Recommended.tar.gz: gzip compressed data - deflate method , original file name

*.tar.Z files    use the 'zcat (file_name.tar.Z) | tar xvf -' command
                 Ex: # zcat explorer.v.3.1.0.tar.Z | tar xvf -

*.tar.gz files   use the 'gzcat (file_name.tar.gz) | tar xvf -' command
                 Ex: # gzcat 2.6_x86_Recommended.tar.gz | tar xvf -
                 You can also use the 'gunzip' command, but that will result in a *.tar file and you will have to use the 'tar -xvf (file_name.tar)' command to expand it.

*.tar.z files    copy to *.tar.Z and use zcat (see above)

*.zip files      use the 'unzip (file_name.zip)' command
                 Ex: # unzip stuff.zip

*.tar files      use the 'tar -xvf (file_name.tar)' command
                 Ex: # tar -xvf 2.6_x86_Recommended.tar

***** zcat can be found on most versions of Solaris in /usr/bin
***** gzcat can be found on the web and Sunsolve CD
***** gunzip or gzunzip can be found in /usr/dist/exe on the corporate network
***** tar can be found on most versions of Solaris in /usr/bin
***** unzip can be found in /usr/dist/local/exe on the corporate network

****NOTE: It is a good idea (due to the locations of these commands) to have them on a floppy or CD that you can bring on-site. *****
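The file type rules above can be collected into one lookup. This is a minimal sketch: the function name extract_cmd is made up, and it only prints the command suggested by the file name, so nothing is unpacked until you run the printed command yourself.

```shell
# Hypothetical helper: print the command this section recommends for a file,
# chosen by file name suffix. It prints the command; it does not run it.
extract_cmd() {
    case "$1" in
        *.tar.Z)  echo "zcat $1 | tar xvf -" ;;
        *.tar.z)  # copy to *.tar.Z first, then use zcat
                  echo "cp $1 ${1%.z}.Z; zcat ${1%.z}.Z | tar xvf -" ;;
        *.tar.gz) echo "gzcat $1 | tar xvf -" ;;
        *.zip)    echo "unzip $1" ;;
        *.tar)    echo "tar -xvf $1" ;;
        *)        # unknown suffix: fall back to identifying the file
                  echo "file $1" ;;
    esac
}

extract_cmd explorer.v.3.1.0.tar.Z
extract_cmd 2.6_x86_Recommended.tar.gz
```

Because the suffix match is case sensitive, *.tar.Z and *.tar.z are handled separately, as they are in the list above.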


T300 (purple): Also see page 67

Description:

The T300 array is a hardware RAID FCAL device. As such, please make sure all firmware and patches are up to date. You can use STORtools* to exercise and troubleshoot the product. The T300 also has a com (rs232) port so you can tip into it, and an ethernet port so you can use telnet, ftp, tftp boot, or administer it through Component Manager. The T300 has an EP (extended Prom) boot that runs post and has its own set of commands. It also runs a limited function unix O/S called PSOS (accessed thru tip or telnet). PSOS can be run from the reserved area on the array drives, or tftp can be used to load it from the server.

*STORtools will only test to the MIA on the T300 product line.

Partner group

Two T310s cabled together through the UICs. The cables coming from the 2 dot (OUT ..) ports on the UIC designate the primary array. The other array (uic 1 dot IN) becomes the secondary array. Only 2 T300s can be in a partner group at this time. In a partner group with 2 fiber paths, the server will access the LUNs thru both paths: top array LUNs thru the top array controller and bottom array LUNs thru the bottom array controller. If something happens to one of the controllers, the LUNs will fail over to the remaining controller.

Tray ID #s (fru stat, fru list)

u#     = unit (right now valid numbers are u1 and u2)
u1d3   = unit 1 disk 3
u2pcu1 = unit 2 power cooling unit 1
u1l1   = unit 1 loop 1 (uic1)
u2ctr1 = unit 2 raid controller 1

Default array login: :/:> root (return) no password

Default Configuration: 1 LUN RAID 5

Chassis Model number history:

p1.7   Darker gray, 2 fiber data ports on raid controller bd.
p1.8   Single fiber data port and HH 1.6" seagate drives
p1.9   Single fiber data port and LP (1.0") drives
p2.0   Redesigned chassis called "barney" (have not seen yet 3/12/00)

Hot pluggable FRUs:

PCU (Power Cooling Unit) battery is good for only 2 years; messages appear in syslog 45 days prior to expiration. Once a PCU is unplugged you have 30 min to change it before the array starts a shutdown sequence. The array requires 3 fans to stay below critical temp.

UIC (Unit Interconnect Controller) verify status thru fru stat. Once a UIC is removed you have 30 min to change it before the array starts a shutdown sequence.

Raid Controller is only redundant in a partner group. Also needs to have some type of DMP running (veritas) to fail over and have the server be able to access the disks on the failed array.


Disk(s) Numbered 1 - 9 left to right while facing front of array. Pull disk out ( use spring loaded latch handle) one inch, wait 30 seconds then remove from array. Once Disk drive is removed you have 30 min to change before array starts a shutdown sequence.

MIA Media Interface Adapter (fiber to copper connection) is only redundant in a partner group. Also needs to have some type of DMP running (veritas) to fail over and have the server be able to access the disks on the failed array.

LEDs: ( in general, for specific info see pg 6-9 & 6-15 install and admin manual)

          Solid                       | Blinking
Green:    normal status               | system activity
Amber:    Fru is being initialized    | Fru failure (controller, uic, pcu, disk)

Path:

C#T#D#S#
| | | |_ Slice
| | |___ T300 volume number (LUN) (use 'port listmap' command)
| |_____ Target ID of array (use 'port list' 'port set' commands)
|_______ Sbus controller # (hba)

**** Use format, scsi, inquiry, mode bytes: 10 = primary path, 30 = secondary path ****
**** You will cause a LUN failover if you try to access the secondary path LUNs through low level commands like format and dd in a partner group ****

sbus@1f,0/SUNW,socal@1,0/sf@1,0/ssd@w50020f2300000a06,1:a

sbus@1f,0                  convert to decimal, divide by 2, round down; result is I/O bd #
socal@1,0                  sbus slot (d = on board soc+)
sf@1,0                     Loop connection: port on the HBA (0 = port A, 1 = port B)
ssd@w50020f2300000a06      WWN#; last 6 digits are from mac address (set command)
,1                         LUN: Volume on array (port listmap)
:a                         slice (a = 0)

T300 Boot:

- Eprom: T300 EP boot (1st stage)
     - POST
     - U1d1 (will try to get PSOS from U1d1-d9, or TFTP if set bootmode tftp)
- PSOS boot (T300 Release x.x) (2nd stage)
     - POST
     - Mount filesystems
     - Load daemons
     - Login prompt


TFTP BOOT: (if chassis is swapped, enter new mac address into /etc/ethers file of tftp server)

On Server:
1. Modify /etc/hosts file on server with ip and name of array
2. Modify (create) /etc/ethers file on server with mac and array name
3. Create /tftpboot directory and copy nbxxx.bin (psos) to it
4. Uncomment '#tftp' in /etc/inetd.conf
5. kill -HUP inetd PID#
6. ps -ef | grep in.rarpd (should be running... restart if tftp doesn't work)

On Array:
7. Modify bootmode to tftp (:/: set bootmode tftp)
8. Modify tftphost to server's IP (:/: set tftphost xxx.xxx.xxx.xxx)
9. Modify tftpfile to nbxxx.bin (step #3) (:/: set tftpfile nbxxx.bin)
10. Modify IP to ip assigned to your array ** (:/: set ip xxx.xxx.xxx.xxx)
11. Reset array

** If rarp is working, the array should get its IP from the server. If the IP is assigned thru the "set" command then the array will go to the 'who is tftphost' phase of tftpboot.

Add a volume (lun) to an array: (:/: sys blocksize (n)k should be set to correct value before 'vol add')

vol add vol_name data u#d#-# raid # standby* u#d9
vol init vol_name data rate(1-16)
vol mount vol_name
vol stat*
vol list*
vol mode*
(Note: if t3b and volslice is enabled, you must create a slice to see lun in format - pg. 76)
*optional

T300 useful commands: (use the 'help' command to get specific switches)

File management:
mkdir, rmdir, cd, pwd, touch, cat, more*, tail, rm, mv, telnet, ftp**

*  more command: use q = quit, f = forward, b = backward
** ftp requires a password on the root account

vol commands:
vol list, vol add, vol remove, vol init, vol mount, vol unmount, vol mode, vol verify, vol stat.

boot        Boot system (-i, -s)
disable     Disable controller (u1, u2) or loop cards (ux lx)
disk        Disk administration (version)
date        Set date and time (200003071607 = 03/07/2000 16:07)
enable      Enable controller (u1, u2) or loop cards (ux lx)
ep          Program the flash prom
fru         Display FRU information (-s, -st, list, stat)
help        Display reference manual pages
id          Display fru identification summary
lpc         Get interconnect card property (ledtest)


passwd      Change or display array password
port        Configure the interface port number (list, listmap, set)
proc        Display status of outstanding vol processes (list, kill)
refresh     Start/stop battery refreshing or display its status
reset       Reset system
set         Display or modify the set information
shutdown    Shutdown disk tray or partner group
sys         Display or modify the system information (list) (*mp_support to rw for dmp)
tzset       Set the time zone
ver         Display the software version
vol         Display or modify volume information

Firmware upgrading: (strongly recommended to have array "out of use" before upgrading firmware. This includes disabling polling from Component Manager)

FTP firmware files to / on the array. At this moment the files can be found at http://icode.ebay but in the future they will be available on Sunsolve as patch 109115.xx.

Raid controller firmware upgrade:
:/:> boot -i nb###.bin
:/:> reset -y     (Warning: if base firmware was below 1.17a, use serial port to reset)

EEprom upgrade:
:/:> ep download ep2_09.bin

UIC upgrade:
:/:> lpc download u#l# lpc_04.11     (3 minutes/card, will take card off line)

Disk upgrade: (unmount volumes, 20 min for 9 disks, led goes amber during download)
:/:> disk download u1d1-9 D44a.lod

Useful Array files:

/syslog            Array error log file, 1Meg in size. Then gets copied to .old
/syslog.old        backup to syslog
/etc/syslog.conf   Configures where to send error messages

Comm port wiring for notebook: ( it works I verified it)

RJ11              DB9               DB25
1 grd ----------- 5 grd ----------- 7 grd
5 RXD ----------- 3 TXD ----------- 2 TXD
6 TXD ----------- 2 RXD ----------- 3 RXD

(RJ11 pins are numbered 1 2 3 4 5 6)

Useful web sites:
http://icode.ebay     Firmware
http://ISI.com     PSOS o/s information
http://thedance.ebay/hardware/arrays/purple/hardware.html     White papers and documentation


ACT ( A Crashdump Tool)

ACT is a tool that can be run against a core dump or a live system. It generates a report that gives you server state information based on the core. ACT should be run on the server that panicked, or at least on a server that has the same O/S version as the core that is being analysed. The engineers that maintain ACT recommend you give it to your customers and have them install it on their servers. When a core dump is produced they can run it on the core and forward the output to the solution center. Because the output is much smaller than the core, it will save time in transmission. ACT is supposed to become the standard output that all centers will accept.

Available at http://cte-www.uk. It is in *.gz format. To expand it:

# gunzip CTEact.tar.gz     (this will create a CTEact.tar file)
# tar -xvf CTEact.tar     (this will explode the CTEact directory)
# pkgadd -d . CTEact     (will install the package into /opt/CTEact)
                          (answer install questions, I selected 'n' for mailout option)

(executable is /opt/CTEact/bin/act)

Examples:
# ./act -l     (output on live server to screen)
# ./act -l -s /tmp/dir/     (output from live server to separate files)
# ./act -d /var/crash/hostname/vmcore.0 -s /tmp/dir/     (output core file to separate files in /tmp/dir)
# ./act -d /var/crash/hostname/vmcore.0 > /tmp/act_out     (output core file to file /tmp/act_out)

****** Info from our website ******

ACT is a tool developed over several years to aid in the process of analysing kernel dumps. It attempts to perform a good first pass on a kernel dump.

ACT prints detailed and accurate information about:
 - Where the kernel panicked
 - A complete list of threads on the system.
 - The contents of the /etc/system file which was read when the failed system booted
 - A list of kernel modules that were loaded at the time of the panic.
 - The output of the kernel message buffer
 - Full deadlock detection relating to threads blocked on mutexes or readers/writer locks.
 - Threads blocked in either getblk() or biowait().

ACT was conceived and developed by Steve Cumming, while working for what was SunService and then while working for SMCC European CTE. After a short illness Steve died on July 12th 1998.

ACT is under continuous development by members of Computer Systems European CTE group based in Bagshot, UK.


Installation

ACT now resides in package format for both x86 and SPARC, so pkgadd should be used for installation.

By installing one of the packages below ACT will be installed for the appropriate architecture and version of Solaris you are running and a new RC script will be installed which will configure savecore and run ACT against the newly generated crash dump upon system reboot.

CTEactx.tar.gz. ACT for X86

CTEact.tar.gz. ACT for SPARC

Or alternatively if you have KENV installed then you can tar the following over kenv in order to update Kenv with the latest version ACT.

KENVact.tar.gz. ACT for KENV.

Instructions

ACT takes the following options, options may appear in any order :

-d corefile ACT assumes that the file corefile contains the kernel core image. This file could be /dev/mem if you want ACT to analyze the running system.

-l Should be used when running act on a live system.

-n namelist ACT assumes that the file namelist contains a valid kernel namelist. This file could be /dev/ksyms if you want ACT to analyze the running system.

-s directory Tells act to split its output into several files, writing the data into the directory specified to aid readability. The files created are (the names speak for themselves): biowait, getblk, modules, msgbuf, mutex, rwlock, threads, system, summary, sunsolve

-u Displays stack information in an alternate form

-z This informs ACT to display timezone information in localtime rather than GMT


Advantages of Splitting a Drive into Multiple File Systems (info doc 14622)

Rather than using an entire disk drive for one file system, which may lead to inefficiencies and other problems, you can split a single drive into sections. The sections are called slices, as each is a slice of the disk's capacity. Once a slice has been allocated, it becomes a logical disk drive. A disk can be split into eight subdisks. The splitting of the disk is often called partitioning or labeling of the disk drive. Below is an example:

Current partition table (original):
Total disk cylinders available: 2036 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders        Size            Blocks
  0       root    wm       0 - 1872      921.87MB    (1873/0/0) 1887984
  1 unassigned    wm       0               0         (0/0/0)          0
  2     backup    wm       0 - 2035     1002.09MB    (2036/0/0) 2052288
  3 unassigned    wm    1873 - 2035       80.23MB    (163/0/0)   164304
  4 unassigned    wm       0               0         (0/0/0)          0
  5 unassigned    wm       0               0         (0/0/0)          0
  6 unassigned    wm       0               0         (0/0/0)          0
  7 unassigned    wm       0               0         (0/0/0)          0

partition>

Here are some of the reasons for multiple filesystems on one hard drive.

1. Damage Control: If the system were to crash due to software error, hardware failure, or power problems, some of the disk blocks might still be in the file system cache and not yet written to disk. This can damage the filesystem structure. The methods used try to reduce this damage, and the fsck utility can repair most of it, but spreading the files across multiple filesystems minimizes the possibility of damage, especially to the files needed during boot-up. When the files are split across disk slices, critical files end up on slices that rarely change, or that are mounted read-only and never change. The chances of them being damaged and preventing you from recovering the remainder of the system are greatly reduced.

2. Access Control: Only complete slices can be marked as read-only or read-write. If you desire to mount the shared Operating System sections as read-only to prevent changes, they have to be on their own slice.

3. Space Management: Files are allocated from a reserve of free space on a per-filesystem basis. If, for example, a user has allocated a large amount of space, depleting the free space, and the entire system disk were a single filesystem, there would be no free space left for critical system files. The entire system would freeze when it ran out of space. Using separate filesystems, especially for user files, means that only a single user, or group of users, is inconvenienced when a filesystem becomes full. The system will continue to operate, allowing the System Administrator to handle the problem. The exception to the above scenario is the root filesystem.

4. Performance: The larger the filesystem, the larger the tables that must be managed. As the disk fragments and space becomes scarce, the fragments of a file may be placed further apart on the disk. Using multiple (smaller) partitions reduces the absolute distance and keeps the sizes of the tables manageable. Although the UFS filesystem does not suffer from table size and fragmentation problems as much as System V file systems, this is still a concern.

page 46

5. Backups: Many of the back-up utilities, such as "ufsdump", work on a complete filesystem basis. If a filesystem is large, it could take longer than you want to allocate to back-up. Most importantly, multiple smaller backups are easier to handle and recover from.

Below is a listing of slices: the required ones, root and swap, and the recommended additional slices such as usr, var, opt, home and tmp.

1. The root slice: The root slice is mounted at the top of the filesystem hierarchy. It is mounted automatically as the system boots, and cannot be unmounted. All other file systems are mounted below the root.

The root filesystem needs to be large enough to hold the following:
* The boot information and the bootable kernel (kernel/genunix), and a backup of the kernel just in case the main one gets damaged.
* Any local system configuration files, which typically reside in the /etc directory.
* Any stand-alone programs, such as diagnostics, that may be run instead of the OS.

The root partition typically runs between 15 and 30mb. It is usually placed on the first slice of the disk, more commonly known as slice 0 or a.

2. The swap slice: The default rule is that there is twice as much swap space as there is RAM installed on the system. For example, if you have 16mb of RAM, the swap space would need to be 32mb. This is just a preliminary template for how much swap to use; there are other factors to consider, for example whether a user's system runs large applications that use large amounts of data, such as a CAD application. You can monitor the amount of swap space used via the pstat or swap commands. If you did not allow enough swap space during the initial install, you can add additional swap with either the swapon or swap commands.

3. The usr slice: The usr slice holds the remainder of the operating system utilities. It needs to be large enough to hold all the packages you chose to install when installing the OS. If you are going to install local applications or third-party applications in this slice, it needs to be large enough to hold them. It is generally better if the usr slice contains the operating system and only symbolic links to the applications. The filesystem is often mounted read-only to prevent changes.

4. The var slice: The var slice holds the spool directories used to queue printer files and mail, as well as log files that may be unique to the system. It also holds the /var/tmp directory, which is used for larger temporary files. It is the read-write counterpart to the usr slice. Every system, even a diskless client, needs its own var filesystem. It is not a filesystem that can be shared with any other system(s).

5. The opt slice: In the newer UNIX systems based on System V release 4 (Solaris 2.x) many sections are now optional and no longer need to be loaded on the /usr filesystem. They are now installed onto the /opt filesystem. Add-on packages are also installed in this filesystem.

6. The home or export home (remote users) slice: The home directory is where the users' login directories are placed. Making home its own slice prevents users from hurting anything else if they run this filesystem out of space. A good starting point for the size of this slice is 1mb per application user plus 5mb per power user and 10mb per developer you intend to support.

Page 47


Advantages of Splitting a Drive into Multiple File Systems (cont):

These are rough estimates, to be used only as a guideline; your configuration may need more or less space per user. This slice is usually /export/home. Don't put things into /home, as this is a reserved mount point for automounted NFS filesystems. It's fine to use when the automounter is turned off, but it is on by default.

7. The tmp slice: Large temporary files are placed in /var/tmp, but smaller temporary files are placed in /tmp. The files in the /tmp directory are very short-lived and are cleared out during a reboot of the system. If users run mostly application based programs, 5 to 10mb should be sufficient for this slice. If developers are the primary users of the system, 10 to 20mb may be needed. Once again, these numbers are only a guideline; your needs may be different.
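The sizing rules of thumb quoted above (swap at twice the installed RAM; home at 1mb per application user, 5mb per power user and 10mb per developer) can be turned into a small calculator. This is only a sketch, assuming a POSIX-compatible shell; the function names swap_mb and home_mb are made up for illustration and are not Solaris commands.

```shell
#!/bin/sh
# Rule-of-thumb slice sizing using only the figures quoted in the text.
# swap_mb and home_mb are hypothetical helper names, not system commands.

# Swap: twice the installed RAM. Argument: RAM in MB.
swap_mb() {
    echo $(( $1 * 2 ))
}

# Home: 1 MB per application user, 5 MB per power user, 10 MB per developer.
# Arguments: app_users power_users developers
home_mb() {
    echo $(( $1 * 1 + $2 * 5 + $3 * 10 ))
}

echo "swap for 16 MB of RAM: $(swap_mb 16) MB"
echo "home for 20 application users, 4 power users, 2 developers: $(home_mb 20 4 2) MB"
```

As the text stresses, these figures are starting points only; adjust them for the actual workload.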

How to configure a system to run on a network (info doc 14981) (also see pg 56 Adding a 2nd network interface)

1. /etc/hosts This file is used to resolve host names into IP addresses. This file must be updated if no naming service is being used. This file should contain the IP and host name of each system on the local network, including any gateways or routers.

Example:
127.0.0.1       localhost
129.145.71.109  kishori loghost   #this is the IP and host name for the local machine
129.145.71.110  sage              #this is the IP and host name for a host on the network

2. # ifconfig -a Be sure that both the loopback and network interface are up and running.

Example:
lo0: flags=849<UP,LOOPBACK,RUNNING,MULTICAST> mtu 8232
        inet 127.0.0.1 netmask ff000000
le0: flags=863<UP,BROADCAST,NOTRAILERS,RUNNING,MULTICAST> mtu 1500
        inet 129.145.71.109 netmask ffffff00 broadcast 129.145.71.255

If the interface to the network is not up and running do the following:

# ifconfig le0 plumb
NOTE: The default may be hme0 (for most Ultra machines)

3. /etc/netmasks This file should contain the netmasks. If you are using the default netmasks and it appears in ifconfig -a, this file is not necessary.

Example:
# The netmasks file associates Internet Protocol (IP) address
# masks with IP network numbers.
#
#       network-number  netmask
#
# Both the network-number and the netmasks are specified in
# "decimal dot" notation, e.g:
#
#       128.32.0.0 255.255.255.0
#
129.145.0.0 255.255.255.0

page 48


How to configure a system to run on a network(cont.):

4. /etc/defaultrouter If you want to define a default router include the router name in this file.

5. /etc/hostname.le0 or /etc/hostname.hme0 (depending on your interface type) This file should contain the name of the local host.

6. /etc/resolv.conf If you are using DNS, this file should contain the name of the domain and the IP address of the nameserver. It is acceptable to list more than one nameserver (up to 3). The nameservers will be consulted in the order listed. Be careful: this file is very sensitive to extra spaces and tabs.

Example:
domain support.Corp.Sun.Com
nameserver 129.150.254.2

7. /etc/nsswitch.conf Check this file for the appropriate entries. If a naming service is being used this file should reflect that.

8. It is a good idea to reboot the system at this point. Check to see if the network is working by pinging other machines both inside and outside of your network.
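Note that ifconfig -a reports the netmask in hexadecimal (for example ffffff00), while /etc/netmasks expects "decimal dot" notation. A minimal conversion sketch follows, assuming a POSIX-compatible shell whose printf accepts 0x-prefixed numbers; hex_to_dotted is a hypothetical helper name, not a Solaris command.

```shell
#!/bin/sh
# Convert an 8-digit hexadecimal netmask, as printed by ifconfig,
# into dotted-decimal notation as used in /etc/netmasks.
# hex_to_dotted is an illustrative name only.
hex_to_dotted() {
    hex=$1
    # Cut the eight hex digits into four 2-digit octets.
    o1=$(echo "$hex" | cut -c1-2)
    o2=$(echo "$hex" | cut -c3-4)
    o3=$(echo "$hex" | cut -c5-6)
    o4=$(echo "$hex" | cut -c7-8)
    # printf converts each 0x-prefixed octet to decimal.
    printf '%d.%d.%d.%d\n' "0x$o1" "0x$o2" "0x$o3" "0x$o4"
}

hex_to_dotted ffffff00
hex_to_dotted ff000000
```

The two calls correspond to the le0 and lo0 netmasks in the ifconfig example above.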

SEVM - How to recover a primary boot disk. (info doc 14820)

NOTE: This document was written for VxVM 2.x. New functionality in VxVM 3.x renders many of the "extra steps" in replacing a primary root disk obsolete. See the comments interspersed below regarding steps when using VxVM 3.x.

If Volume Manager (VxVM) is running on a system with the root disk encapsulated and mirrored, and the root disk fails, the system stays up and running, due to the fact that it is mirrored, but how can you recover the original root disk?

First, some terminology:

The 'primary' root disk is the system disk on which the OS was originally installed. This disk was "encapsulated" into VxVM and then mirrored. Since this disk is encapsulated, there is a direct mapping of partitions onto volumes for /, swap, /usr, and /var.

The 'secondary' root disk is a disk which was first initialized into VxVM and then used to form a mirror for the primary root disk.

VxVM 2.x: Since it was initialized, rather than encapsulated, there is no mapping of partitions onto the volumes /, swap, /usr, and /var.
VxVM 3.x: When the mirror of the root disk is created, the mapping of partitions onto the volumes /, swap, /usr, and /var is maintained.

Page 49


SEVM - How to recover a primary boot disk. (cont.)

RECOVERING THE 'SECONDARY' BOOT DISK:

If the 'secondary' system disk fails, the replacement of the disk is straightforward. It is handled in the same manner that any other failed drive needs to be replaced.

The easiest way to do this is to run 'vxdiskadm' and choose option #4 (Remove a disk for replacement). Then, shut down the system (if necessary) to physically replace the disk, and reboot.

Run 'vxdiskadm' again, this time choosing option #5 (Replace a failed or removed disk). When asked to 'encapsulate' the disk, reply "no", and then reply "yes" when asked if you wish to initialize it.

This will begin recovery of the disk and the mirrors will resync automatically.

RECOVERING THE 'PRIMARY' BOOT DISK:

NOTE: If you are running Volume Manager version 3.x.x or above, it is not necessary to follow the steps below. Instead, the process for replacing the 'primary' boot disk is EXACTLY the same as that for the 'secondary' boot disk, which is shown above. This is because Volume Manager 3.x automatically creates the underlying "hard" partitions for /usr and /var on the replacement disk, whereas older versions did not.

If you are using Volume Manager 2.x, continue on:

The recovery of the 'primary' boot disk contains a few additional steps because the procedure must reestablish the direct mapping between the partitions on the disk and the system volumes. This is necessary so that the system can be changed back to use underlying devices, should this be necessary (for example, to perform a system upgrade or boot from cdrom to fsck one of these filesystems).

1. Run 'vxdiskadm' and choose option #4 (Remove a disk for replacement). Then, shut down the system (if necessary) to physically replace the disk, and reboot.

2. Run 'vxdiskadm' and choose option #5 (Replace a failed or removed disk). When asked to 'encapsulate' the disk, reply "no", and then reply "yes" when asked if you wish to initialize it.

3. This step will change depending on the number of partitions on the boot disk. The 'vxdiskadm' command will put back partition 0 (for /) automatically, and may also do this for swap. However, if you have any additional volumes on that disk (i.e., /usr or /var), you will have to run a command to put the partition on the new disk in the correct location.

Examine the partitions on the replaced disk by running 'format' or 'prtvtoc' on it. At the very least, you will see a partition for root and one for the public and one for the private partitions for VxVM. Determine if any partitions are missing. If so, these "missing" partitions can be recreated easily using the steps below.

The command to use is 'vxmksdpart'. You give this command the name of a particular subdisk, and it creates a partition on the disk in the correct location. The syntax is:

/etc/vx/bin/vxmksdpart <subdisk> <partition> <tag> <flags>

Page 50


SEVM - How to recover a primary boot disk. (cont.)

For example, if you have a subdisk named "disk01-02" and wanted to create partition 7 on the disk to map this subdisk, you can run

/etc/vx/bin/vxmksdpart disk01-02 7 0x00 0x00

3a. SWAP. To create a partition for the swap volume, run:

/etc/vx/bin/vxmksdpart -g rootdg <subdisk> <partition> 0x03 0x01

where <subdisk> is the name of the subdisk used in the swapvol volume on the primary boot disk (for example, "rootdisk-01"), and <partition> is the unused partition to use for swap (for example, "1"). The "0x03" tag specifies this partition is for 'swap'.

3b. USR. To create a partition for /usr (if this disk contains /usr), run:

/etc/vx/bin/vxmksdpart -g rootdg <subdisk> <partition> 0x04 0x00

3c. VAR. To create a partition for /var (if this disk contains /var), run:

/etc/vx/bin/vxmksdpart -g rootdg <subdisk> <partition> 0x07 0x00

There is no reason to create any other partitions on the boot disk.
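The tag and flag pairs from steps 3a to 3c can be collected in one helper that prints the matching vxmksdpart command line for review before anything is run. This is a sketch only, assuming a POSIX-compatible shell; vxmksdpart_cmd is a hypothetical name, and the subdisk and partition arguments must come from your own 'vxprint' and 'prtvtoc' output.

```shell
#!/bin/sh
# Print (do not execute) the vxmksdpart command for a system volume,
# using the tag/flag values given in steps 3a-3c above.
# vxmksdpart_cmd is an illustrative helper, not a VxVM command.
# Arguments: volume (swap|usr|var)  subdisk  partition
vxmksdpart_cmd() {
    case "$1" in
        swap) tag=0x03; flags=0x01 ;;
        usr)  tag=0x04; flags=0x00 ;;
        var)  tag=0x07; flags=0x00 ;;
        *)    echo "unknown volume: $1" >&2; return 1 ;;
    esac
    echo "/etc/vx/bin/vxmksdpart -g rootdg $2 $3 $tag $flags"
}

# Subdisk "rootdisk-01" and partition 1 are the examples used in step 3a.
vxmksdpart_cmd swap rootdisk-01 1
```

Printing the command rather than running it keeps the sketch safe to try on any machine.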

Disable DMP

Note: Be sure to do these steps first:
1. umount all file systems created on Volume Manager volumes
2. Stop the Volume Manager (vxdctl stop).

1. remove the "vxdmp" driver from the "/kernel/drv" directory
   rm /kernel/drv/vxdmp
2. edit /etc/system, and remove the line:
   forceload: drv/vxdmp
3. Remove the Volume Manager DMP files:
   rm -rf /dev/vx/dmp /dev/vx/rdmp
4. symbolically link /dev/vx/dmp to /dev/dsk
   ln -s /dev/dsk /dev/vx/dmp
5. symbolically link /dev/vx/rdmp to /dev/rdsk
   ln -s /dev/rdsk /dev/vx/rdmp
6. shut down the system to disable the DMP functionality
7. reboot

Patch 105181-20 not loading... Check for 106125, 106292, 106361-08

page 51


Memory Scrubbing

On Ultra Enterprise (sun4u) platforms ECC is generated and checked by the UPA devices (CPU, SYSIO and PSYCHO), not by the memory controller (Address Controller or AC). Thus, ECC covers the entire data path between devices and memory.

***This means that an ECC error can be reported against a memory (DIMM/SIMM) that might not be bad ***

A few ECC errors do not by themselves call for DIMM/SIMM replacement. However, when the errors are exactly 12 hours apart, the DIMM/SIMM must be replaced. The memory scrubber runs every 12 hours after the system is booted. It scans physical memory, reading each memory location to determine whether the data and ECC are correct. If the data does not match the ECC, the ECC is rerun and the memory content is corrected. Errors exactly 12 hours apart mean the error appeared again despite the correction; it will be corrected again, but the DIMM/SIMM must be replaced.

To check whether memory scrubbing is enabled, do:

# echo disable_memscrub/X | adb -k
physmem 3b7b
disable_memscrub:
disable_memscrub:       0

if it is "0" it is enabled
if it is "1" it is disabled
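The "exactly 12 hours apart" test can be checked mechanically once two error timestamps have been converted to seconds since the epoch. The sketch below assumes a POSIX-compatible shell; is_scrub_interval is a hypothetical helper, and the 60-second default tolerance is an assumption for illustration, not a documented value.

```shell
#!/bin/sh
# Succeed if two timestamps (seconds since the epoch) are a whole multiple
# of 12 hours apart, within a tolerance in seconds.
# is_scrub_interval and the 60 s default tolerance are illustrative assumptions.
# Arguments: t1 t2 [tolerance_seconds]
is_scrub_interval() {
    period=43200                     # 12 hours in seconds
    tol=${3:-60}
    diff=$(( $2 - $1 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    # Errors much closer together than one scrub period do not match.
    if [ "$diff" -lt $(( period - tol )) ]; then
        return 1
    fi
    rem=$(( diff % period ))
    # Accept a remainder just above or just below a multiple of the period.
    if [ "$rem" -le "$tol" ] || [ "$rem" -ge $(( period - tol )) ]; then
        return 0
    fi
    return 1
}

if is_scrub_interval 1000000 1043210; then
    echo "errors are a multiple of 12 hours apart"
fi
```

Converting the syslog timestamps to epoch seconds is left to the reader.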

Display a remote application GUI on your local server

When using telnet to connect to a remote server, you can have an application that has a GUI interface (like VTS) display on your local server by doing the following:

1. # /usr/openwin/bin/xhost + (run this on your local server. 'xhost -' removes permissions)

2. Connect to remote server and:

If using csh, use this syntax:
# setenv DISPLAY <hostname>:0.0
example:
# setenv DISPLAY persia:0.0

If using sh or ksh, use this syntax:
# DISPLAY=<hostname>:0.0
# export DISPLAY

3. Run application and the GUI should display on the local server

page 52


Cluster 2.x
http://suncluster.eng
http://neato.east/suncluster/scinstall.html (good install doc)

General:

- Up to 4 nodes in cluster
- Only Sun Storage is supported (can get waiver, but seldom granted)
- HA or PDB (Parallel Data Base)
  HA - 1 server runs at up to 100% or 2 run at up to 50% so the other node can take over in case of failure
  PDB - Both servers access the database simultaneously, no logical hosts or shared ccd
- Supports Solaris 2.6, 7, 8
- Supports QFE, SCI, fast ethernet, gigabit ethernet on the private net
- Supports different types of server nodes in the cluster
- Terminal concentrator is a special model, it does not send a break on power on
- DMP and Fast Write Cache not supported (touch /kernel/drv/ap before vxvm install to not load DMP)

Cluster install (chapter 8 sun cluster 2.2 book)

Admin w/s
- Only requires end user distribution
- 2.2 release 7/00 has all the cluster related o/s patches
- install order: o/s, cluster patches, cluster software
- important files:
  /etc/clusters      logical hostname and nodes
  /etc/serialports   node name and concentrator port

Server install
- Requires full distribution, 10k requires full+oem
- installer must be root
- Avoid 'scinstall' "change" option if possible. Use 'scconf' command

Software components:
  CMM - Cluster Membership Monitor
  CCD - Cluster Configuration Database
  SMA - Private Network Management
  SSVM/CVM - Volume manager
  PNM - Public Network Management
  Logical Hosts
  DLM - Distributed Lock Manager
  Data Services

Topologies:
  Clustered Pair
  N+1 (hot standby node)
  Ring or cascade
  N to N scalable (cascading failover)
  Shared Nothing (used for Informix parallel server)

OPS: (Oracle Parallel Server)

- No logical hosts
- Syncing between the instances of Oracle goes over the private network
- No shared CCD
- Must select CVM on install even with Volume Manager 3.0.4, to get OPS pick at end.
- Must install UDLM (Oracle CD)
- Create shared disk group while only one node in cluster.

Page 53


Cluster 2.x (cont.)

Hardware Notes:

Must change the initiator id on one node if using SCSI arrays between 2 nodes (see procedure 5-17)

If Quorum device is replaced it needs to be reconfigured.
  # scconf -q

A5000 - full loop only
  must be mirrored
  DMP, FW cache not supported
  Direct or Hub attached (pg 5-23 5-27)

Wiring Diagrams (pg 5-30)

SCI - scrubber jumpers need to be 'on' on one node, 'off' on all the other nodes
  /opt/SUNWsma/bin (has the SCI sm_config template files you need to modify and run sm_config)
    switch1.sc (4 nodes, 8 cards, 2 switches)
    switch2.sc (2 nodes, 4 cards, 2 switches)
    link1.sc (2 nodes, 4 cards, 0 switches)
  # /opt/SUNWsma/bin/sm_config -f template file

Terminal Concentrator - port 1 is used for setup (numbered 1-8 not 0-7) (pg 5-56)
  Enable setup mode - Power On < 30sec (test button) 15 more sec (test button)
  should get monitor::
  :: erase EEPROM (to set password to default, default is IP address of box)
  Remove the password from port 8 in a 3 node N to N cluster for 'port locking'

Cluster Commands:

abort partition
  Same as scadmin stopnode... Use scadmin stopnode command

ccdadm
  # ccdadm <clustname> -p ccd.database.ssa (creates a ccd.database.pure file for recovery use)
  -r ccd.database.pure (restores to ccd.database file)
  -v (verify consistency of the dynamic copy of ccd.database)
  -x (convert the candidate file to a CCD database. Or verifies the CCD file.)

ccp
  Command used to run the cluster control panel software on the admin workstation
  # ccp clustername &

cconsole
  Command used to start up the cluster console on the admin W/S
  # cconsole

get_node_status
  Command used to get the status of a node (also can use hastat and scconf clustername -p commands)
  # get_node_status

haswitch
  Switch logical host to another node (will start the reconfiguration)
  # haswitch nodename

hastat
  Will give you the status of the cluster, will lie if private network is down. You can run it in the common window to get all views
  # hastat (-m 0 skip messages)

hareg
  Registers a data service with HA and associates it with the given logical host.
  # hareg -s -r dataservice -h logicalhost
  # hareg -y dataservicename (to turn on a dataservice)
  # hareg (to verify a service is turned on)
  # hareg -n dataservicename (to stop a data service)
  # hareg -u dataservicename (will shutdown dataservice on all logical hosts)

Page 54


Cluster 2.x (cont)

Cluster Commands: (cont)

pnmset
  Command to create PNM NAFO groups (on each node) for the public network interfaces to be used for the NFS data service.
  # /opt/SUNWpnm/bin/pnmset (follow interactive install)

pnmstat -l
  Command lists the /etc/pnmconfig file (to set up NAFO groups)

scadmin startcluster
  The first node into the cluster must enter with the 'startcluster' switch.
  # scadmin startcluster nodename clustername

scadmin startnode
  All remaining nodes can join the cluster with the startnode switch
  # scadmin startnode

scadmin stopnode
  To remove your node from the cluster use the stopnode switch. (do this before init or shutdown commands)
  # scadmin stopnode

scadmin switch
  Switch logical host to another node (will start the reconfiguration), same as haswitch command
  # scadmin switch nodename

scconf
  Command used to configure cluster parameters (many, use MAN)
  # scconf -F (creates admin filesystem, each node)
  # scconf -L (for logical hosts) (one node, diskset)
  # scconf -q (for quorum device)
  # scconf -N (to change a node ethernet address)

scdidadm
  Command to initialize the Disk ID pseudo driver (SDS install only), builds a file with paths from each node to disks
  # scdidadm -r (on node 0 to initialize)
  # scdidadm -l (L) (verify DID configuration)

scinstall
  Installation command for Sun Cluster from CD

scmgr
  Command to start Sun Cluster manager (cluster monitor) (set DISPLAY)
  # /opt/SUNWcluster/bin/scmgr nodename &

xhost
  Command on admin W/S to allow all xhost connections from cluster nodes (graphics)
  # /usr/openwin/bin/xhost +

Cluster Files:

/etc/opt/SUNWcluster/conf/clustername.cdb
  Contains install info, flat file, use more command to view.

/etc/opt/SUNWcluster/conf/ccd.database
  Contains cluster database, viewed by scconf, scadmin commands. If you have to restore this file to a 'bad' node, you must reboot (file info is kept in memory)

/etc/opt/SUNWcluster/conf/hanfs/vfstab.logicalhostname
  Logical hosts vfstab file

/etc/opt/SUNWcluster/conf/hanfs/dfstab.logicalhostname
  Logical hosts dfstab file (shared filesystems)

/etc/clusters
  Admin W/S file, contains cluster names and node names

/etc/serialports
  Admin W/S file, contains node names and port assignments on the concentrator

/etc/pnmconfig
  Public network file. pnmset command creates, pnmstat -l command will list.

/etc/hosts
  You must enter logical host name and IP.

Page 55


Cluster 2.x (cont)

Cluster Files:

/etc/name_to_major
  vxio must have the same number on both nodes to switch nfs logical host (unencapsulate first, change number)

/opt/SUNWcluster/bin
  Most SC2.2 commands are located in this directory

/var/opt/SUNWcluster
  Cluster error messages are located in this directory and in /var/adm/messages

Encapsulating root after using Environmental CD to load O/S:

The newer PCI based servers come with an Operating Environment Installation CD to use with Solaris 2.5 and 2.6. This CD will create a mini-root partition and allows you to install and boot the server from the older versions of Solaris.

The mini-root is currently Solaris 7 and starts at cylinder 0 on the boot disk. Once the intended version of Solaris is loaded, the environmental CD makes mini-root (not mini-me) swap (slice 1), leaving it starting at cylinder 0. This is alright if you are not encapsulating root.

When you then encapsulate root, swap (slice 1) remains starting at cylinder 0, and Veritas will not allow that space to be used for a core dump. It assumes it is reserved for the VTOC.

One way we have used to get around this is to boot from the Operating Environment Installation CD, load mini-root onto one disk and the intended O/S on another, through the custom install option. Then boot from the other disk and encapsulate it.

Adding a second network interface:

(also see pg 48 - 49 How to configure a system to run on a network)
This procedure can also be used to add the first network interface and may work without rebooting the machine.

- add hostname and ip address to /etc/hosts file (hostname is usually hostname_interface ex: sunnie_qfe0)
- create a /etc/hostname.interface file        # touch /etc/hostname.qfe0
- vi /etc/hostname.interface file, add entry at top (no spaces): hostname_interface
- ifconfig interface (hme0, qfe0, etc.) plumb
- ifconfig interface inet IP_address           # ifconfig qfe0 inet 129.145.121.123
- ifconfig interface netmask 255.255.255.0     # ifconfig qfe0 netmask 255.255.255.0
- ifconfig interface broadcast IP_address.255  # ifconfig qfe0 broadcast 129.145.121.255
- ifconfig interface up                        # ifconfig qfe0 up
- ifconfig -a (if ready to use, should look like this:)

qfe0: flags=863<UP,BROADCAST,NOTRAILERS,RUNNING,MULTICAST> mtu 1500
        inet 129.145.121.123 netmask ffffff00 broadcast 129.145.121.255
        ether 8:0:20:88:xx:xx

*** Warning: touch the file /etc/notrouter so the server will not route between the two ethernet interfaces ***
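The "IP_address.255" shorthand for the broadcast address only holds for a 255.255.255.0 netmask. For other masks, the broadcast address is the network part of the IP with every host bit set to 1. A minimal sketch of that calculation, assuming a POSIX-compatible shell; broadcast_addr is a hypothetical helper name, not a Solaris command.

```shell
#!/bin/sh
# Compute the broadcast address from a dotted-decimal IP and netmask.
# broadcast_addr is an illustrative name only.
# The function body runs in a subshell so the IFS change stays local.
# Arguments: ip netmask
broadcast_addr() (
    IFS=.
    # Split both dotted quads on "." into $1..$4 (address) and $5..$8 (mask).
    set -- $1 $2
    # Per octet: keep the network bits, set all host bits to 1.
    echo "$(( ($1 & $5) | (255 - $5) )).$(( ($2 & $6) | (255 - $6) )).$(( ($3 & $7) | (255 - $7) )).$(( ($4 & $8) | (255 - $8) ))"
)

broadcast_addr 129.145.121.123 255.255.255.0
```

The call above reproduces the broadcast value used in the qfe0 example.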

Adding a default gateway:

# route add default gateway_IP_address
then vi /etc/defaultrouter and enter gateway_IP_address (to keep router thru reboots)

page 56


Veritas Volume Manager :

Volume Manager takes physical disks and allows you to create logical volumes across these disks.
A group of physical disks is called a 'disk group'.
All or portions of these physical disks can be combined to create logical 'volumes'.
You can then create filesystems on these logical volumes that span multiple physical disks.

A disk under Veritas control has 2 partitions on it, a public and a private region.
The public region is the size of the whole physical disk.
The private region is 1024 sectors long. The configuration database is located in this region.
There is enough room in the private region to define 128 disks.
The private region is usually located at the beginning of a disk and is usually slice # 3.

If you run # prtvtoc /dev/rdsk/c#t#d#s2 on a disk initialized under vm (# vxdisksetup -i c#t#d#):
a '15' in the Tag column output indicates the private region
a '14' in the Tag column output indicates the public region

Rules:

- There must be a rootdg for vxvm to come up at boot. This is usually made when you run vxinstall to install Volume Manager and encapsulate your boot disk. You do not have to encapsulate the boot disk, however; rootdg can be made up of any disk.

- You must have 2 unassigned slices to encapsulate a disk. (public and private regions)

- vxunroot will unencapsulate a volume only if /, swap, /usr, /var, and /opt are the only filesystems on the encapsulated disk.

In general, the flow of building logical volumes, creating a filesystem and mounting it is as follows:

1. Assign physical disks to the free disk pool (to use with volume manager)
   # vxdisksetup -i c#t#d# c#t#d# (etc...)

2. Create a disk group (uses disks in the free disk pool. You assign names. nconfig is private db copies, default is 4, and nlog is kernel logs; both switches are optional)
   # vxdg init diskgrp_name disk_name=cxtxdx nconfig=# nlog=#

3. Add disks from the free disk pool to the disk group
   # vxdg -g diskgrp_name adddisk disk_name=cxtxdx disk_name=cxtxdx (etc...)

4. Create a logical volume in your disk group
   # vxassist -g diskgrp_name -U fsgen make vol_name size layout=stripe nstripe=# disk_name disk_name (etc..)
   (size ex: 100m; layout can be mirror, stripe, or raid5 {nolog})

5. Mirror a striped or concat logical volume (optional)
   # vxassist -g diskgrp_name mirror vol_name disk_name disk_name disk_name (etc..)

6. Start the volume
   # vxvol start vol_name

7. Make the filesystem that sits on the logical volume
   # newfs /dev/vx/rdsk/diskgrp_name/vol_name

Page 57


Veritas Volume Manager (cont):

In general, the flow of building logical volumes, creating a filesystem and mounting it is as follows (cont.):

8. Create a mount point (you decide dir_name)
   # mkdir /dir_name

9. Mount the filesystem on the mount point
   # mount /dev/vx/dsk/diskgrp_name/vol_name /dir_name
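The flow above can be rehearsed as a dry run that only prints the commands for one disk, one disk group and one volume, so the sequence can be reviewed before anything touches a disk. This is a sketch under stated assumptions: vxvm_plan is a hypothetical helper, all argument values are placeholders, the optional steps 3 (adddisk) and 5 (mirror) are skipped, and the layout= attribute is omitted because only a single disk is named.

```shell
#!/bin/sh
# Print (do not execute) the command sequence from the steps above.
# vxvm_plan is an illustrative helper, not a Volume Manager command.
# Arguments: device disk_name diskgrp_name vol_name size mount_point
vxvm_plan() {
    dev=$1; disk=$2; dg=$3; vol=$4; size=$5; mp=$6
    echo "vxdisksetup -i $dev"                             # step 1
    echo "vxdg init $dg $disk=$dev"                        # step 2
    echo "vxassist -g $dg -U fsgen make $vol $size $disk"  # step 4
    echo "vxvol -g $dg start $vol"                         # step 6
    echo "newfs /dev/vx/rdsk/$dg/$vol"                     # step 7
    echo "mkdir $mp"                                       # step 8
    echo "mount /dev/vx/dsk/$dg/$vol $mp"                  # step 9
}

# Placeholder names; substitute your own device, group and volume names.
vxvm_plan c1t0d0 disk01 datadg vol01 100m /data
```

Note the raw device (/dev/vx/rdsk) is used for newfs and the block device (/dev/vx/dsk) for mount, as in steps 7 and 9.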

Break a mirror and unencapsulate:

# vxprint -htg rootdg (get the names of mirror plexes)
# vxplex -g rootdg -o rm dis rootvol-02 swapvol-02 (use pl names from vxprint)
# vxunroot (this will ask for a re-boot when completed)

(you can use vxdiskadm to re-encapsulate)

Break a mirror and take the plex to make another volume:

# vxprint -htg dg_name (find plex name of mirror volume you want to use)
# vxplex -g dg_name dis plex_name (dissociate plex from volume)
# vxmake -g dg_name -U fsgen vol vol_name plex=plex_name (make the volume)
# mkdir /mp_name (create a mount point)
# vxvol -g dg_name start vol_name (start the newly created volume)
# mount /dev/vx/dsk/dg_name/vol_name /mp_name

To boot without Volume manager:

comment out 'vxio' lines in /etc/system (usually 2 lines at the end of vm section)
copy /etc/vfstab to /etc/vfstab.vm
copy /etc/vfstab.prevm to /etc/vfstab
touch /etc/vx/reconfig.d/state.d/install-db
reboot

(to reverse)
uncomment 'vxio' lines in the /etc/system file (on both disks if root was mirrored)
copy /etc/vfstab.vm to /etc/vfstab (on both disks if root was mirrored)
rm /etc/vx/reconfig.d/state.d/install-db
reboot

Deport and Import a disk group:

# vxdg list (get a list of disk groups)
# vxdg deport dg_name
# vxdg import dg_name (optional switches: -n name, -s for shared, -t for temporary)

Remove a volume from Volume Manager:

# umount /vol_name (or the filesystem that sits on the volume)
# vxvol -g dg_name stop vol_name (stop the volume)
# vxedit -g dg_name -r rm vol_name (recursively removes volume, plex, and sub-disk from vm)

Page 58


Veritas Volume Manager (cont):

Volume Manager commands:

vxdg free               how much free space in a disk group: vxdg -g dg_name free
vxdg list               list all imported disk groups (exported use: vxdisk -s list | grep dgname)
vxdg init               creates a disk group: vxdg init dg_name disk_name=c#t#d#
vxdg adddisk            add disk to dg: vxdg -g dg_name adddisk disk_name=cxtxdx
vxdg rmdisk             remove disk from dg: vxdg -g dg_name rmdisk disk_name
vxdg upgrade            upgrade dg after VM upgrade: vxdg upgrade dg_name
vxdg deport             deport a dg: vxdg deport dg_name
vxdg import             import a dg: vxdg import dg_name
vxassist make           makes a logical volume (layout can be mirror, stripe, or raid5):
                        vxassist -g diskgrp_name -U fsgen make vol_name size layout=stripe nstripe=# disk_name disk_name (etc.)
vxassist maxsize        what is the max size raid you can make in a disk group (layout can be mirror, stripe, or raid5):
                        vxassist -g dg_name maxsize layout=stripe nstripe=#
vxassist mirror         mirror a stripe or concat vol: vxassist -g dg_name mirror vol_name disk_name(s) &
vxassist remove mirror  used to remove a mirror permanently (do not use to break mirror):
                        vxassist -g dg_name remove mirror vol_name
vxplex                  used to attach and dissociate plex(es) with volumes:
                        vxplex att vol_name plex_name or vxplex -o rm dis plex_name
vxdisk -s list | grep dgname   gives you a listing of all disk groups
vxdisksetup -i          used to add a disk to the volume manager free disk pool: vxdisksetup -i c#t#d#
vxdiskunsetup -C        used to remove a disk from the free disk pool: vxdiskunsetup -C c#t#d#
vxdiskadd               will do both the vxdisksetup and vxdg adddisk: vxdiskadd c#t#d#
vxvol start             start a volume after it was made with vxassist or vxmake: vxvol start vol_name
vxvol stop              used to stop a volume after a umount: vxvol stop vol_name
vxedit -r rm            allows you to recursively remove a volume, plex or subdisk: vxedit -r rm vol_name plex_name
vxmake sd               manually make a sub-disk: vxmake sd sd_name offset=# len=size disk=disk_name
vxmake plex             manually make a plex from a sub disk: vxmake plex plex_name sd=sd_name
vxmake vol              manually make a volume from a plex: vxmake -U fsgen vol vol_name plex=plex_name
vxunroot                unencapsulates a disk: vxunroot disk_name
vxdiskadm               menu driven disk administration
vxiod set               set the number of vxiod daemons (default is 10. 2/cpu is recommended): vxiod set #
                        permanently set daemons in the s85vxvm-startup2 file.

Volume Manager files:

Set PATH to:                            /etc/vx/bin:/opt/VRTSvmsa/bin:.
/etc/vx/reconfig.d/state.d/install-db   Touch this file to prevent Volume Manager from starting
/etc/vx/reconfig.d/disk.d/cxtxdx
/var/opt/vmsa/logs/commands             all GUI commands are located here
/etc/vfstab.prevm                       Copy of the vfstab before vm was installed
/opt/VRTS/bin/vea                       GUI for version 3.5

Page 59


FTPing to and from sunsolve:

You can use this to temporarily store files that you may want to access at a customer's site, or to send files from a customer site that you can retrieve on swan. Anything sent to sunsolve will be deleted after two days.

Internal to sunsolve: (change to directory where the file you want to send resides)

# rftp sunsolve.sun.com
Name : anonymous or suncore
Password: (enter your e-mail address or suncore passwd; the suncore passwd changes weekly, check url:)
https://livelink.central.sun.com/livelink/livelink?func=ll&objId=5537115&objAction=browse&sort=name
ftp> cd cores
ftp> mkdir dir_name (as of 5/01 you cannot create directories. Skip to bin command)
ftp> cd dir_name
ftp> pwd
257 "/cores/dir_name" is current directory.
ftp> bin
ftp> put file_name_to_be_sent
ftp> quit
#

External from sunsolve:

# ftp sunsolve.sun.com (192.9.9.24)
login: anonymous
password: your_email_address
ftp> cd cores/dir_name/ (as of 5/01 you cannot create directories. Skip to bin command)
ftp> bin
ftp> get file_name_to_be_retrieved
ftp> quit
#

Page 60


Serengeti: 3800 - 6800

General: (first supported O/S on serengeti Solaris 8 4/01)

Serengeti 8 (3800):

Support for 2 to 8 UltraSPARC III processors (2 system bds max)
Up to 64 Gbytes of memory (8 banks of 4 DIMMs each, 2 banks/CPU. A CPU may be installed without a populated bank, but a populated bank must have its corresponding CPU installed)
12 hot-swappable compact PCI (cPCI) slots
Up to 2 domains
Power
  Server: up to 3 power supplies, NEMA 6-15P (connect internal to rack)
  Rack mount: up to 2 NEMA L6-30P

Serengeti 12 (4800):

Support for 2 to 12 UltraSPARC III processors (3 system bds max)
Up to 96 Gbytes of memory (8 banks of 4 DIMMs each, 2 banks/CPU. A CPU may be installed without a populated bank, but a populated bank must have its corresponding CPU installed)
16 PCI slots or * 8 hot-swappable cPCI slots or * combination of 8 PCI and 4 cPCI
Up to 2 domains
Power
  Server: up to 3 power supplies, NEMA 6-15P
  Rack mount: up to 2 NEMA L6-30P

Serengeti 12i (4810): (100% front access for specialized environments.)

Support for 2 to 12 UltraSPARC III processors (3 system bds max)
Up to 96 Gbytes of memory (8 banks of 4 DIMMs each, 2 banks/CPU. A CPU may be installed without a populated bank, but a populated bank must have its corresponding CPU installed)
16 PCI slots or * 8 hot-swappable cPCI slots or * combination of 8 PCI and 4 cPCI
Up to 2 domains
Power
  Server: up to 3 power supplies, NEMA 6-15P (connect internal to rack)
  Rack mount: up to 2 NEMA L6-30P

Serengeti 24 (6800):

Support for 2 to 24 UltraSPARC III processors (6 system bds max)
Up to 192 Gbytes of memory (8 banks of 4 DIMMs each, 2 banks/CPU. A CPU may be installed without a populated bank, but a populated bank must have its corresponding CPU installed)
32 PCI slots or * 16 hot-swappable cPCI slots or * combination of PCI and cPCI
Up to 4 domains (2 domains / partition)
Power
  Rack mount: up to 4 NEMA L6-30P

Hardware:

SC Board (SSC): System Console. You can tip or telnet to the SC card to configure/maintain the server. There are 3 shells you can access and configure from the SC: Platform shell, Domain shell and O/S shell on a specific domain. The SC bd is part of the platform; it is not configured into a domain. A second (slave) SC board is installed if the redundancy kit is ordered. The SC runs its own O/S and is upgraded and backed up across the ethernet connection.

Repeater Bds (RP): The repeater boards establish and maintain the connections between the system boards and the I/O boats. The 3800 and 4800 have 2 repeater boards, although the circuitry for the repeaters on the 3800 is on the centerplane. The 6800 has 4 repeater bds.

* When available

Page 61


Serengeti: 3800 - 6800: (cont.)

System Boards (SB): The system board is common across all 3 servers. It can have 2 or 4 CPUs installed on it (they are not field replaceable). The system board has sockets for 8 banks of 4 DIMMs. Each CPU has 2 corresponding DIMM banks. It is possible that a CPU might not have any DIMMs installed in its corresponding banks. However, a populated DIMM bank must have a corresponding CPU installed.

I/O boat (IB): The I/O boat types are PCI or cPCI; there is no sbus I/O boat. The PCI and compact PCI adapters are installed in the I/O boats. Currently cPCI is only available on the 3800.

ID Board: The ID board is a pre-programmed daughter board that is on the centerplane. The 3800's ID board is incorporated into the centerplane. The ID board has the system chassis ID #, system serial #/host ID, (6) MAC addresses for the 6800 and (4) for the 3800, 4800.

LEDs:                 on                                   off
Activate (green):     Bd is activated. You must NOT        Bd is not activated: you can
                      remove the board when this           remove the board when this
                      LED is on                            LED is off
Fault (amber):        an internal fault occurred           No internal fault occurred
Removal ok (amber):   you can safely remove the            you must not remove the
                      component under hot-pluggable        component under hot-pluggable
                      conditions.                          conditions.

Partitioning:

You can configure the server in single or dual partition mode. If you select dual partition mode, each partition will be electrically separated from the other. The 3800 (on-board repeaters) and the 48x0 have dual repeaters; one will be configured for each partition. The 6800 has 4 repeater bds; 2 will be configured for each partition. Dual partition mode is recommended for keeping domains electrically separated.

Domains: On the Serengeti, you configure the resources you want allocated to each domain. The domain (like on an E10K) then becomes an independent server. At a minimum each domain must have a system bd, I/O boat with ethernet/scsi PCI card, and a boot disk.

Domain/Partition configurations:

3800/4800/6800:
  configuration               Domain IDs
  1 partition   1 domain      A
  1 partition   2 domains     A,B
  2 partitions  2 domains     A,C (1 per partition; 6800 see comment below)

6800: Domains A,B even bd #s grid0; C,D odd bd #s grid1 (best practices)
  2 partitions  3 domains     ABC, ABD, ACD, or BCD
  2 partitions  4 domains     A,B,C,D

Page 62


Serengeti: 3800 - 6800: (cont.)

To connect to the SCC:
# tip hardwire (from admin workstation, or notebook PC, to SCC0 console port)
# telnet ip_address_of_SC (IP address of SC must be configured: > setupplatform)

Power on hardware:
Connect to SCC, enter 0 (platform shell)
> poweron all
to verify: > showboards -v

Switch to domain: (from platform shell)
> console -d [a, b, c or d]

Power on domain: (from domain shell)
> setkeyswitch on (will start POST)
to verify: > showkeyswitch
(once POST is complete you should get the OK prompt and be able to run standard OBP commands and boot)

Power off domain: (after init 0, shutdown, etc.) (from domain shell)
> setkeyswitch off (wait, takes a while)

Power off platform: (after domain(s) are set off) (from platform shell)
> poweroff all

update SC firmware: (from platform shell on the SC)

>flashupdate -y -f ftp://root:password@host_ip/path_to_new_firmware all rtos

Run this command from the platform shell. Keep in mind this command will not update the slave SC. To update it you must make it the primary or run the command from the slave SC.

>flashupdate -c <source board> <replacement board> (to copy firmware btwn like bds)

Save SC configuration: (from platform shell on the SC)

> dumpconfig -f ftp://root:password@host_ip/path_to_dumpdir

Restore SC Configuration: (from platform shell on the SC)

> restoreconfig -f ftp://root:password@host_ip/path_to_dumpdir

To create/modify Platform: (from platform shell on the SC)

> setupplatform (enter information and modify ACLs. For each domain use the deleteboard and addboard -d commands)

To create/modify Domain: (from Domain shell on the SC)

> setupdomain (enter information. Defaults in [ ]s)

Page 63


Serengeti: 3800 - 6800: (cont.)

Navigating between shells:

When you first connect: enter 0 (platform), 1 (domain A), 2 (domain B), etc...

Platform -> Domain    > console -d [A,B,C,D] (will go to OBP, O/S or shell)
Domain -> Platform    > disconnect
Domain -> OBP         > break (after 'setkeyswitch on' has been run)
OBP -> Domain         ctrl ] or ~# or (telnet) ctrl ] send break or (ssh) #.
Domain -> O/S         > resume (after O/S was brought up via 'boot' command)
O/S -> Domain         ctrl ] or ~# or (telnet) ctrl ] send break or (ssh) #.

Platform Shell commands: (command -h will give you a listing... ** command available on slave SC)

addboard                 assign a board to a domain -d,
connections **           show connections to the system controller or a domain
console                  connect to a domain shell/console -d
deleteboard              delete a board from a domain
disablecomponent         add a component to the blacklist
disconnect **            disconnect this connection or a specified connection
dumpconfig **            save the system controller configuration to a server
enablecomponent          delete a component from the blacklist
flashupdate **           update flash prom images -y, -f,
help                     show help for a command or list commands
history **               show shell command history
password                 change platform or domain password
poweroff                 turn components off
poweron                  turn components on
reboot **                reboot the system controller
reset **                 reset the other system controller
restoreconfig **         restore the system controller configuration from a server
service                  service mode (see page 94)
setchs (service cmd)     setchs -s ok, suspect, faulty -r "reason for status" -c /N0/SB2/p2
setdate                  set the date and time for the platform
setdefaults              set default configuration values
setescape (5.16.00)      change escape characters (default #.)
setfailover (5.13.00)    changes the state of SC failover on, off, force,
setkeyswitch             set the keyswitch position for a domain on
set-keygen (5.16.00)     generates/lists ssh host keys/fingerprint -l -r
setupplatform **         configure the platform -p acls, -p partition,
showboards               show board information -d, -e, -p, -v,
showchs (service cmd)    shows chs status (use with setchs)
showcomponent            show state of a component -v,
showdate                 show the current date and time for the platform
showenvironment          show environment sensors -u, -w, -l, -p, -v,
showescape (5.16.00)     lists escape character
showframe                show frame information -v,
showfailover (5.13.00)   displays SC and clock failover status
showfru (5.16.00)        list frus in system -r manr
showkeyswitch            show the keyswitch positions
showlogs                 show the logs -d, -v,
showplatform **          show the status of domain and platform configuration -d, -p, -v,
showsc **                show system controller uptime, version, and configuration -v,
sshrestart (5.16.00)     restarts ssh server to put new host keys into effect
testboard                test a board
testinterconnect         run interconnect test (available in service mode only)

Page 64


Serengeti: 3800 - 6800: (cont.)

Domain Shell commands: (command -h will give you a listing...)

addboard            assign a board to a domain -d
break               send break to the domain console
connections         show connections to the domain
deleteboard         delete a board from a domain
disablecomponent    add a component to the blacklist
disconnect          disconnect this connection
enablecomponent     delete a component from the blacklist
help                show help for a command or list commands
history             show shell command history
password            change domain password
poweroff            turn components off
poweron             turn components on
reset (-x)          reset the domain (XIR, will dump a hung domain)
resume              return to domain console
setdate             set the date and time for the domain
setdefaults         set default configuration values
setkeyswitch        set the keyswitch position on, off
setupdomain         configure the domain -v,
showboards          show board information -v,
showcomponent       show state of a component -v,
showdate            show the current date and time for the domain
showdomain          show domain configuration -v
showenvironment     show environment sensors -v,
showkeyswitch       show the keyswitch position
showlogs            show the logs -v
testboard           test a board

Setup remote logging:

In setupplatform:
  Syslog loghost [ ]: ip_of_adminStation
  Log Facility [ ]: local0 (can be 0-7)

In setupdomain: (for each domain)
  Syslog loghost [ ]: ip_of_adminStation
  Log Facility [ ]: local1 (can be 0-7)

In syslog.conf on admin station:
  local0.notice /var/adm/messages.platform
  local1.notice /var/adm/messages.domainA
  (etc...)

Admin station:
  create the files: # touch /var/adm/messages.nnnnnnn
  restart syslog: # kill -HUP `cat /etc/syslog.pid` or (/etc/init.d/syslog stop) (/etc/init.d/syslog start)

(continued on next page)

Page 65


Setup remote logging: (cont.)

/usr/lib/newsyslog file: (so logs do not grow forever. On line 2 enter all message file names you created.)

--- Change ---
  LOG=messages
  cd /var/adm
  test -f $LOG.2 && mv $LOG.2 $LOG.3
  test -f $LOG.1 && mv $LOG.1 $LOG.2
  test -f $LOG.0 && mv $LOG.0 $LOG.1
  mv $LOG $LOG.0
  cp /dev/null $LOG
  chmod 644 $LOG

--- To ---
  #LOG=messages
  for LOG in messages messages.platform messages.domainA (etc..)
  do
  cd /var/adm
  test -f $LOG.2 && mv $LOG.2 $LOG.3
  test -f $LOG.1 && mv $LOG.1 $LOG.2
  test -f $LOG.0 && mv $LOG.0 $LOG.1
  mv $LOG $LOG.0
  cp /dev/null $LOG
  chmod 644 $LOG
  done

to test logging:

- # logger -p local0.notice "test message for platform log file" (check contents of log files to make sure logging is working) (if not, check permissions on log file)

- setfailover off/on and check log file on log host (if not, snoop interface, make sure log entry is reaching loghost; also make sure syslogd is not running with the -t switch)
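The syslog.conf entries for the admin station follow one pattern (facility localN, level notice, one log file per shell), so they can be generated instead of typed. This is only a sketch: the function name gen_syslog_conf is made up for illustration, and the mapping local0 = platform, local1 onward = domains in the order given is an assumption that must match whatever you entered in setupplatform and setupdomain. The selector and the file name are separated by a tab.

```shell
#!/bin/sh
# Hypothetical helper: print syslog.conf lines for the platform log and
# for each domain letter given on the command line.
# Assumption: local0 = platform, local1..localN = domains in the order given.
gen_syslog_conf() {
    n=0
    # \t puts a TAB between the selector and the log file name.
    printf 'local%d.notice\t/var/adm/messages.platform\n' "$n"
    for dom in "$@"; do
        n=$((n + 1))
        printf 'local%d.notice\t/var/adm/messages.domain%s\n' "$n" "$dom"
    done
}

# Print the entries for the platform plus domains A and B.
gen_syslog_conf A B
```

Review the output before appending it to /etc/syslog.conf on the admin station, then create the files and restart syslog as described above.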

Notes:
- Use 'connections' command to see if ghost sessions are keeping you from connecting to a domain. (reset the SC, from slave SC or reset button, to remove those sessions.)
- Use the dash (-) to remove an entry when running setupplatform

Firmware: http://pts-americas.west/esg/msg/techinfo/platform/sun_fire/firmware-matrix/

Patch #     SC Firmware   CPU (MHz)                       Domain Firmware             Other features
---------   -----------   -----------------------------   -------------------------   -------------------
112127-xx   5.12.5        750/900 (Masks 2.1/2.2 only)    5.12.x
            5.12.6        750/900 (Masks 2.1/2.2 only)    5.12.x                      DR
            5.12.7        750/All 900                     5.12.x                      DR/900 2.3
112494-xx   5.13.0        750/All 900                     5.12.x or 5.13.x            DR/SC auto failover
            5.13.1        750/All 900                     5.12.x or 5.13.x            "
            5.13.2        750/All 900/1050                5.12.x or 5.13.x            DR/1050/failover
            5.13.3        750/All 900/1050                5.12.x or 5.13.x            "
                          750/All 900/1050                5.12.x or 5.13.x            "
            5.13.5        750/All 900/1050                5.12.x or 5.13.x            " /L2 timing
112883-xx   5.14.0        750/All 900/1050                5.12.x, 5.13.x or 5.14.0    DR/Failover/COD
            5.14.4        750/All 900/1050/1200           5.12.x, 5.13.x or 5.14.x    " /L2 timing
112884-xx   5.15.0        750/All 900/1050/1200                                       ASR
114523-01   5.16.0        750/All 900/1050/1200                                       SSH

Freshchoice (scsi2/ethernet) adapter firmware has problem booting CDROM. Bug 4397457

workaround: To patch get-mail of ISP fcode to give longer timeout period:

(set nvram parameter fcode-debug? to true.)

ok cd /ssm@0,0/pci@b,2000/pci@2/SUNW,isptwo@4
ok patch 100 64 get-mail
ok

Page 66


Mounting and unmounting CD without vold:

to stop vold : (automount daemon for cdrom and floppy) # /etc/init.d/volmgt stop

to mount cdrom: # mount -F hsfs -o ro /dev/dsk/c0t6d0s0 /cdrom

to unmount cdrom: # umount /cdrom

to start vold : # /etc/init.d/volmgt start

Send a file using mailx Command line:

# mailx -r return_email_address -s subject_no_spaces sendto_email_address < filename

This will put the file contents into the body of the e-mail. Use for text documents, POST output, etc.

Send a message using mailx Command line:

# mailx -r return_email_address -s subject_no_spaces sendto_email_address
Cc: (enter cc: e-mail address if any)
Type text of e-mail
(control d) when finished
EOT
#

More T3 info:

Forgotten password:
reset the T3
press (return) within 3 seconds of reset (on the console session you have open)
type set passwd (this will display the current password)

T3 Logging: (you will need to modify the T3's host file and syslog.conf file by ftping them to a unix server, editing them, sending the files back to the T3 and resetting the T3)

You should already have the T3 connected to the network and be able to telnet to the T3.
Type 'set' to make sure you have an IP, netmask, gateway, and hostname on the T3.

:/: set logto *
modify T3's host file (add IP and hostname of loghost)
modify T3's syslog.conf (add line '*.info @ip_address_of_loghost')
modify loghost syslog.conf file (add line 'local7.info [tab] /var/adm/messages.t3'); must use local7
touch /var/adm/messages.t3 on loghost
kill -HUP the syslogd pid, or stop and restart syslogd, on the loghost
ftp modified host and syslog.conf to the T3
reset the T3 to have changes take effect

Page 67


StarCat 15K:

General:

StarCat 15K:
Has 18 available slots for system board sets. In each of the 18 available slots, you can configure (1) system bd (slot 0 bd) and (1) hsPCI, MaxCPU or SunFire Link bd (slot 1 bd).

Supports up to 18 system bds (72 CPUs) and a combination of slot 1 bds, not to exceed 18 total:
  up to 17 MaxCPU bds (34 CPUs),
  up to 18 hsPCI boards (2 3.3v and 2 5v PCI adapter slots per board),
  up to 18 SunFire Link boards (includes 1 3.3v and 1 5v PCI adapter slot per board)
Domains: up to 18 domains
Power requirements: (12) NEMA L6-30P (2 separate power grids)

System Board set: (up to 18)

System board set is made up of a system board (slot 0 bd) and a slot 1 type board. A slot 1 type board is usually an I/O (hsPCI) board, but can be a SunFire Link or MaxCPU bd. The slot 0 and slot 1 boards are physically mounted on a 'carrier plate and expander board'. The expander bd/carrier plate is then inserted into one of the 18 available slots of the StarCat.

Control Board Set: (2) See FIN I0771-1 (keep old ID bd if replacing CP1500 bd on the SC). Also see I0761-1 (upgrade CP1500 POST & OBP)

Control Board set is made up of a 'System Controller Bd', 'System Controller Peripheral Bd' and a 'CenterPlane Support Bd'. The system controller runs Solaris and the SMS packages. The System Controller peripheral board has 2 SDS mirrored boot disks, DVD-ROM and a 4mm DAT that are used by the System Controller board. The System Controller bd and SC peripheral bd are mounted on the CenterPlane Support bd. The CenterPlane Support bd is then inserted into one of the 2 control bd slots on the StarCat. The Control Bd set provides system clock, I2C monitoring bus, console bus to all domains, serial port and 2 net ports to outside world, serial port internal to other SC, and internal net connection to each domain and other SC.

The SCs come with an O/S installed in a 'sys-unconfig' state. When you run smsconfig -m to configure your SCs, it is easiest if the SCs are on the network and able to reach their default gateway. IPMP contacts the gateway to determine if the physical interface is up.

Floating = community hostname and IP address. This address will follow the main SC
failover = virtual IP and hostname that will float between hme0 and eri1 on each SC
hme0, eri1 = regular IP and hostnames for the interfaces

SC console port pinout: (plus null modem info for connection to 25pin/9pin serial ports)

SC pin   SC signal   connects to   pin on 25-pin/9-pin serial port
5        rd          td            2/3
3        td          rd            3/2
2        cts         dtr           20/4
1        dtr         dsr, dcd      6,8/6,1
4        gnd         gnd           7/5

Page 68


StarCat 15K: (cont...)

Example of IPMP configuration on Sun Fire 15K system controllers, C (Community) Network:

                 System Controller Floating IP  <=====  .150
                               |
                  _____________|_____________
                 |                           |
         _____ .151                        .152  ======> IPMP Logical IP failover Address
        /        |                           |
       /   ______|_________          ________|_______
  IPMP    |                |        |                |
  LEVEL   |      SC0       |        |      SC1       |
       \    hme0     eri1             hme0     eri1
        \_  .100     .101             .200     .201  ===> IPMP Test IP Address

Private internal net interfaces:

scman0: SC's internal ethernet interface to each domain (I1 network)
scman1: SC's internal interface to other SC (I2 network)
dman0: Domain's internal ethernet interface to each SC and domain (I1 network)

15k O/S install:

System controller: The SCs are fully functional servers with 2 SDS mirrored 18gb disks, DVD-ROM and a 4mm DAT. They will come already loaded from the factory with Solaris and SMS. At this time, there is no way to create the 'idprom.image' files in the field (so make sure they are backed up). The default login and password is sms-svc, sms-svc.

Domain install: If the domain has a D240 attached, the install (after creating the domain: setupplatform, deleteboard, addboard, setobpparams, setkeyswitch) can be done from the D240's DVD-ROM. If you do not have a DVD-ROM attached to the domain you are loading, you will most likely have to boot net.

To boot from an install server:

(on install server)
- add the ethernet address and node_name in the /etc/ethers file
- add the node_name in the /etc/hosts file
- # add_install_client node_name sun4u (Solaris CD in the Tools directory, keep CD mounted)
- check the /tftpboot directory for created files (file name is the hex representation of the node's IP address)
- check the /etc/bootparams file for node_name

(on the domain)
- check out your network interfaces: ok> watch-net-all
- check out network interface alias: ok> devalias
- change if desired interface is not listed: nvunalias, show-nets, nvalias, nvstore
- boot net_alias -install
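When checking /tftpboot it helps to know which file name to look for. The sketch below converts a dotted IP address to the hex form the step above describes. The function name ip_to_hex and the sample address are illustrative only, and the 8-digit uppercase format is an assumption about how the name is written.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to an 8-digit uppercase hex string,
# the form the handbook says names the client's file under /tftpboot.
ip_to_hex() {
    # Split the address on dots into the four positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
}

# Sample address only; substitute the domain's IP.
ip_to_hex 192.168.1.10    # prints C0A8010A
```

On the install server, `ls /tftpboot | grep "$(ip_to_hex node_ip)"` would then show whether add_install_client created the entry.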

Blacklist: is populated/unpopulated by hand or with the 'enablecomponent'/'disablecomponent' commands. The path is /etc/opt/SUNWSMS/config/platform or A-R/blacklist. Use the 'hpost -? blacklist' command to list possible entries

.postrc: The path is /etc/opt/SUNWSMS/config/platform or A-R/.postrc. Use the 'hpost -? .postrc' command to list possible entries

Page 69


StarCat 15K: (cont...)

Send BREAK to domain: (be careful, will stop Solaris) from the console connection: ~# (goes right to OK prompt, NOT domain shell like Serengeti)

Decoding CPU locations: 15k

  /SUNW,UltraSPARC III @1c 2,0

  1c   change to decimal, divide by 2; result is EX slot (1c hex = 28; 28/2 = 14; EX slot = 14)
  2    CPU ID = 0-3 system bd; 8,9 (MaxCPU bd.)

Decoding Memory locations: 15k

  /SUNW,memory-controller @12 2,400000

  12   change to decimal, divide by 2; result is EX slot (12 hex = 18; 18/2 = 9; EX slot = 9)
  2    CPU ID = 0-3 system bd; 8,9 (MaxCPU bd.)
  4    memory offset: 4 = bank0, 6 = bank1

Decoding I/O card locations: 15k

  /pci@17c,700000/pci@1/SUNW,isptwo@4/disk@0,0

  17            change to decimal, divide by 2; result is EX slot (17 hex = 23; 23/2 = 11 r1; EX slot = 11)
  c             c = IOC0 (slot 0 or 1), d = IOC1 (slot 2 or 3)
  7             6 = I/O slot 0 or 2, 7 = I/O slot 1 or 3
  pci@1         always 1
  SUNW,isptwo   board type
  @4/disk@0,0   device identifier
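The hex-to-EX-slot arithmetic used in all three decodes above can be checked from any shell. This is a small sketch, not an SMS command: ex_slot is a made-up name, and you still have to pick the hex field out of the device path by hand. It prints the decimal value, the EX slot and the remainder in the same "r" notation as the I/O example.

```shell
#!/bin/sh
# ex_slot HEX: convert the hex field taken from the device path to
# decimal and divide by 2, as in the handbook's worked examples.
ex_slot() {
    dec=$((0x$1))
    echo "hex $1 = $dec; EX slot = $((dec / 2)) r$((dec % 2))"
}

ex_slot 1c    # CPU example
ex_slot 12    # memory example
ex_slot 17    # I/O example
```

For the three handbook paths this reproduces EX slots 14, 9 and 11.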

SMS (Sun Management Server)

Default login: sms-svc
Default password: sms-svc

SMS daemons:

dca  - domain configuration agent. One for every POST. Talks to dcs on domain (only on active SC)
dsmd - domain status monitoring daemon (only on active SC)
dxs  - domain X server. One for each domain (only on active SC)
efe  - event front-end daemon. Part of SMC; acts as intermediary btwn SMC agent and SMS (only on active SC)

Page 70


StarCat 15K: (cont...)

SMS daemons: (cont)

esmd - environmental status monitoring daemon (only on active SC)
fomd - failover monitoring daemon
frad - field replaceable unit access daemon
hwad - hardware access daemon
kmd  - key management daemon (only on active SC)
mand - management network daemon
mld  - messages logging daemon
osd  - OpenBoot Server daemon (only on active SC)
pcd  - platform configuration database daemon (only on active SC)
ssd  - SMS startup daemon
tmd  - task management daemon (only on active SC)

SMS Files:

/export/home/sms-svc/.sms_env - SMS user environment
/export/home/sms-svc/.cshrc - SMS user .cshrc
/export/home/sms-svc/.login - SMS user .login
/etc/opt/SUNWSMS/.sms_groups - sms groups file

/etc/opt/SUNWSMS/config/dsmd_tuning.txt - Domain status and monitoring daemon tuning info
/etc/opt/SUNWSMS/config/esmd_tuning.txt - Environmental status and monitoring daemon tuning info
/etc/opt/SUNWSMS/config/fomd.cf - Failover monitoring daemon config file
/etc/opt/SUNWSMS/config/fomd_sys_datasync.cf - Failover monitoring daemon datasync file
/etc/opt/SUNWSMS/config/platform/.postrc - Platform specific .postrc file
/etc/opt/SUNWSMS/config/platform/blacklist - Platform specific blacklist file
/etc/opt/SUNWSMS/config/A/.postrc - Domain specific (A-R) .postrc file
/etc/opt/SUNWSMS/config/A/blacklist - Domain specific (A-R) blacklist
/etc/opt/SUNWSMS/startup/ssd_start - Start script for the ssd daemons
/etc/opt/SUNWSMS/startup/sms_env.sh -

/var/opt/SUNWSMS/.pcd/domain_info - Platform configuration database daemon domain info
/var/opt/SUNWSMS/.pcd/platform_info - Platform configuration database daemon platform info
/var/opt/SUNWSMS/.pcd/sysboard_info - Platform configuration database daemon sysboard info

/var/opt/SUNWSMS/adm/.logger - Message logging daemon specifics
/var/opt/SUNWSMS/data/osdTimeDeltas - OpenBoot PROM server daemon info file
/var/opt/SUNWSMS/data/A/nvramdata - Domain specific (A-R) nvram information
/var/opt/SUNWSMS/data/A/idprom.image - Domain specific (A-R) idprom information
/var/opt/SUNWSMS/data/A/bootparamdata - Domain specific (A-R) boot parameters

SMS commands: (/opt/SUNWSMS/bin)

addboard - assigns, attaches and configures a board to the domain (domain_id|domain_tag)
addtag - adds the specified domain tag name (new_tag) to a domain
cancelcmdsync - the command synchronization commands work together to control the recovery of user-defined scripts interrupted by a system controller (SC) failover

Page 71


SMS commands :(cont)

console - creates a remote connection to the domain's virtual console driver, making the window in which the command is executed a "console window" for the specified domain
deleteboard - removes a board from the domain it is currently assigned to
deletetag - removes the domain tag name associated with the domain
disablecomponent - adds a component to the domain or platform blacklist
enablecomponent - removes a component from the platform, domain or ASR blacklist
flashupdate - updates the Flash PROM in the system controller (SC), and the Flash PROMs in a domain's CPU and MaxCPU boards, given the board location. (/opt/SUNWSMS/firmware)
    ex: flashupdate -f /opt/SUNWSMS/hostobjs/sgcpu.flash SB1 (leave Name blank to do all SBs)
fruupdate (command in 'help' listing, but no description or man page)
help - displays a list of valid SMS commands along with their correct syntax
initcmdsync - the command synchronization commands work together to control the recovery of user-defined scripts interrupted by a system controller (SC) failover
marginclock [-f (65|75|83.333) | -s synth-freq | -m [+/-] margin-percent] [-y]
marginvoltage [-p1.5] [-p2.5] [-p3.3] [-p5.0] [-pcore] [-m(0|+|-)] [-d domain_id|domain_tag] [-d domain_id|domain_tag...] [-b location] [-b location...] [-y]
moveboard - first attempts to unassign location from the domain it is currently assigned to and possibly active in, then proceeds to assign, connect, and configure location to the domain
poweroff - powers off the specified dual 48V power supply, fan tray, or board
poweron - powers on the specified dual 48V power supply, fan tray, or board
reset - allows you to reset one or more domains in one of two ways: reset the hardware to a clean state or send an externally initiated reset (XIR) signal
resetsc - resets the other SC
runcmdsync - prepares the specified script for automatic synchronization (recovery) after a failover
savecmdsync - the command synchronization commands work together to control the recovery of user-defined scripts interrupted by a system controller (SC) failover
setbus - performs dynamic bus reconfiguration on active expanders in a domain
setchs - SMS1.4 set component health status. SMS can auto fail components. setchs lets you change the status
setcsn - SMS1.4 set chassis serial number. Allows you to set csn once. (showplatform) # setcsn -c serial#
setdatasync - schedule filename enables you to specify a user-created file to be added to or removed from the data propagation list
setdate - allows the SC platform administrator to set the SC or optionally a domain date and time values. Allows domain administrators to set the date and time values for their domains
setdefaults - removes all SMS instances of a previously active domain. A domain instance includes all pcd entries except network information; all message, console, and syslog log files; and, optionally, all NVRAM and boot parameters. pcd entries and NVRAM and boot parameters are returned to system default settings
setfailover - provides the ability to modify the state of failover for the SC failover mechanisms
setkeyswitch - changes the position of the virtual keyswitch to the specified value
setobpparams - allows a domain administrator to set the virtual NVRAM and REBOOT variables passed to OpenBoot PROM by setkeyswitch
setupplatform - sets up the available component list for domains
showboards - displays board assignments
showbus - displays the bus configuration of expanders in active domains
showchs - SMS1.4 displays component health status. EX: showchs -r sb15
showcmdsync - displays the command synchronization list to be used by the spare system controller (SC) to determine which commands or scripts need to be restarted after an SC failover
showcomponent - displays whether the specified component is listed in the platform, domain, or ASR blacklist file
showdatasync - provides the current status of files propagated (copied) from the main SC to its spare
showdate - displays the date and time for the system controller (SC) or a domain
showdevices - displays the configured physical devices on system boards and the resources made available by these devices

Page 72


StarCat 15K: (cont...)

showenvironment - displays the environmental data
showfailover - provides the ability to monitor the state of the SC failover mechanism
showkeyswitch - displays the position of the virtual keyswitch of the specified domain
showlogs - displays platform or domain log files. The default is the platform message log
showobpparams - allows a domain administrator to display the virtual NVRAM and REBOOT parameters passed to OpenBoot PROM by setkeyswitch
showplatform - shows the available component list and domain state for domains
showxirstate - displays CPU dump information after sending a reset pulse to the processors
smsbackup - creates a cpio(1) archive of files that maintain the operational environment of SMS
smsconfig -m - configures and modifies the host name and IP address settings used by the MAN daemon, mand (must have SCs on the network and able to contact the default router for IPMP to work)
smsrestore - restores the operational environment of the SMS from a backup file created by smsbackup
smsversion - displays the active version and exits when only one version of SMS is installed
sysid {-d domain_id|-f filename} [-m YYYYMMDDhhmm] [-M machineType (defaults to 0x82)] [-e etherAddr] [-s serial#|-H host_id]
sysid -F textIDPROMfile -f newBinaryfile
thermcal - use command if replacing a csb bd
testemail - SMS1.4 allows you to generate a test email to verify SMS logging and recipients
xir [-d domain_id|domain_tag [-d domain_id|domain_tag]...] [-q] [-y]

local-mac-address: The "local-mac-address?" eeprom parameter is used to enable the MAC addresses which are burnt in on network cards.
false - do not use the card's burnt-in addresses, use the nvram default address for all interfaces (shown on obp banner)
true - use the on-board MAC address (if there is any). This setting is necessary to get a unique MAC address per interface.

The default setting of local-mac-address? is "false". On non-clustered servers the installation engineer must set local-mac-address? to true; otherwise the same MAC address appears several times on the network, which causes network problems.

SDS - How to mirror the root disk

Use this procedure to mirror the system disk partitions using Solstice DiskSuite:

- first format the second disk exactly like the original root disk: (typically s7 is reserved for metadatabase)

# prtvtoc /dev/rdsk/c0t0d0s2 > /tmp/firstdisk
# fmthard -s /tmp/firstdisk /dev/rdsk/c1t0d0s2

- create at least 3 state database replicas on unused (10mb) slices.

# metadb -a -f -c 3 c0t0d0s7 c1t0d0s7 (-a and -f options create the initial state database replicas. -c 3 puts three state database replicas on each specified slice)

- for each slice, you must create 3 new metadevices: one for the existing slice, one for the slice on the mirrored disk, and one for the mirror. To do this, make the appropriate entries in the md.tab file.

slice 0, create the following entries in (/etc/lvm/md.tab)

d10 1 1 /dev/dsk/c0t0d0s0
d20 1 1 /dev/dsk/c1t0d0s0
d0 -m d10

Page 73

SDS - How to mirror the root disk (cont...)

slice 1, create the following entries in (/etc/lvm/md.tab)

d11 1 1 /dev/dsk/c0t0d0s1
d21 1 1 /dev/dsk/c1t0d0s1
d1 -m d11

Follow this example, creating groups of 3 entries for each data slice on the root disk.

- run the metainit command to create all the metadevices you have just defined in the md.tab file. If you use the -a option, all the metadevices defined in the md.tab will be created.

# metainit -a -f (-f is required because the slices on the root disk are currently mounted)

- make a backup copy of the vfstab file: # cp /etc/vfstab /etc/vfstab.pre_sds

- run the metaroot command for the metadevice you designated for the root mirror. In the example above, we created d0 to be the mirror device for the root partition, so we would run:

# metaroot d0

- edit the /etc/vfstab file to change each slice to the appropriate metadevice. 'metaroot' command has already done this for you for the root slice.

/dev/dsk/c0t0d0s1 - - swap - no -
to
/dev/md/dsk/d1 - - swap - no -

Make sure that you change the slice to the main mirror (d1), not to the simple submirror (d11).

- reboot the system. Do not proceed without rebooting your system, or data corruption will occur.

- After the system has rebooted, you can verify that root and other slices are under DiskSuite's control:

# df -k
# swap -l

The outputs of these commands should reflect the metadevice names, not the slice names.

- Last, attach the second submirror to the metamirror device.

# metattach d0 d20 (must be done for each partition on the disk, and will start the syncing of data)

- to follow the progress of this syncing for this mirror, enter the command

# metastat d0

Although you can run all the metattach commands one right after another, it is a good idea to run the next metattach command only after the first syncing has completed. Once you have attached all the submirrors to the metamirrors, and all the syncing has completed, your root disk is mirrored.
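The three md.tab entries per slice in the procedure above follow a fixed pattern, so they can be generated rather than typed. The following is a minimal Python sketch, assuming the d1N / d2N / dN naming and the c0t0d0 / c1t0d0 disk names used in the example; the function name md_tab_entries is hypothetical and is not part of DiskSuite.

```python
def md_tab_entries(slices, root_disk="c0t0d0", mirror_disk="c1t0d0"):
    """Return md.tab lines for the given slice numbers.

    Naming follows the handbook example (an assumption, not a DiskSuite rule):
      d1N = submirror on the existing root-disk slice
      d2N = submirror on the second disk
      dN  = one-way mirror built on d1N (d2N is attached later with metattach)
    """
    lines = []
    for n in slices:
        lines.append(f"d1{n} 1 1 /dev/dsk/{root_disk}s{n}")
        lines.append(f"d2{n} 1 1 /dev/dsk/{mirror_disk}s{n}")
        lines.append(f"d{n} -m d1{n}")
    return lines


if __name__ == "__main__":
    # Slices 0 and 1, as in the handbook example.
    print("\n".join(md_tab_entries([0, 1])))
```

Review the generated lines before appending them to /etc/lvm/md.tab; slice 7 is normally left out because it holds the state database replicas.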

Page 74

IPMP: (Solaris 8 Update 2 10/01)

General Description:

IPMP allows you to create a logical IP address that can be swapped on-the-fly to another physical network interface.

IPMP Test IP Address: assigned to the physical interfaces (hme0,qfex,ge). This address is used by IPMP to determine the status of the physical interface. It is not for use by applications.

IPMP Logical IP Address: IP address is used by applications for data transfers to and from the server. This IP address will failover between the configured interfaces.

IPMP Logical IP failover Address:  .151  (fails over between hme0 and qfe0)
IPMP Test IP Address:              hme0 = .100, qfe0 = .101

Setup ipv4 IPMP: (IPMP group w/ 1 standby interface) see IP Multipathing Admin Guide

ok> setenv local-mac-address? true
# ifconfig hme0 plumb 172.20.66.100 netmask + broadcast +
# ifconfig qfe0 plumb
# ifconfig hme0 group test-group
# ifconfig qfe0 group test-group
# ifconfig hme0 addif 172.20.66.151 netmask + broadcast + -failover deprecated up
# ifconfig qfe0 plumb 172.20.66.101 netmask + broadcast + deprecated -failover standby up
# ifconfig -a

/etc/hostname.hme0 :

172.20.66.100 netmask + broadcast + group test-group up \
addif 172.20.66.151 deprecated -failover netmask + broadcast + up

/etc/hostname.qfe0 :

172.20.66.101 netmask + broadcast + deprecated group test-group -failover standby up

Page 75

T3B or T3+ Firmware Rev 2.1 New Functions:

Volume slicing:
- Create max 16 slices within a T3, either WG or PP.
- Layered on top of volumes. If volume is unmounted all slices go away.
- Volume slices cannot be seen until the volume is initialized and mounted.
- Minimum size is 1GB, increments of 1GB, starts on GB boundaries.
- Maximum size is size of volume.
- Once enabled cannot be disabled.

EX: (simple example of slicing a volume on a t3+)
Enabled by new system variable enable_volslice.

sys enable_volslice (Note: if volslice is enabled, you must create a slice to see lun in format)
vol add vol_name data u#d#-# raid # standby* u#d9
vol init vol_name data rate(1-16) optional
volslice create slice_name -z size vol_name
volslice list
lun perm list (should be rw, else `lun default all_lun rw')
vol mount vol_name

Lun mapping and masking:
- Enabled with volume slicing.
- Each slice must be mapped to a lun.
- Slices can be renumbered to unused lun.
- Luns range from number 0 to 15.
- Lun masking controls access to lun.
- Lun permissions can be none, ro (read only), rw (read write).
- Lun permissions set for all or by WWN of hba.
- Default lun permissions is rw when slice is created from existing volume.
- Lun permissions are none when slices are made of volume created after volume slicing is enabled.

New Mapping / Masking command: lun

Mapping:
lun map add lun <lun#> slice <slice#>
lun map rm lun <lun#> [slice <slice#>]
lun map rm all
lun map list [lun <lun#> | slice <slice#>]

Masking:
lun perm
lun perm list
lun default
lun wwn list
lun wwn rm all
lun wwn rm wwn <wwn#>

WWN Groups: Allows groups of wwns to share security features, saves lazy typists.

New command: hwwn
hwwn add <grp_name> wwn <wwn#>, rm <grp_name> wwn <wwn#>
hwwn list <grp_name>
hwwn rmgrp <grp_name>
hwwn listgrp

Fabric Support: Enabled thru new sys variable fc_topology, three possible settings:
- auto: chooses between loop and fabric_p2p, depending on capability of attached device.
- loop: establishes an arbitrated loop connection thru a translated loop (TL) port
- fabric_p2p: establishes a fabric connection thru an F port

NTP can also run in the array to sync time with external server.

Page 76

Hitachi StorEdge 99X0 Arrays:

SE9910 - Single cabinet, logic boards in front, disk drives in rear.
Max 16GB cache, 24 host ports, 48 disk drives.
Up to 4096 logical devices can be configured and presented.

SE9960 - One DKC logic cabinet, one to six DKU disk cabinets, arranged on right and left (R1-3, L1-3). R1 is added first, add on alternate sides for best performance.
Max 32GB cache, 32 host ports, 512 disk drives.
Up to 4096 logical devices can be configured and presented.

SE9970V - Single cabinet, logic boards in front, disk drives in rear.
Max 32GB cache, 48 host ports, 128 disk drives.
Up to 8092 logical devices can be configured and presented.

SE9980V - One DKC logic cabinet, one to four DKU cabinets. Added same as 9960.
Max 64GB cache, 64 host ports, 1024 disk drives.
Up to 8092 logical devices can be configured and presented.

All use the concept of "storage clusters" redundant combinations of cache boards, host adapter boards (CHA) and disk adapters boards (DKA). All array transactions run through the cache.

Drives are set up in either RAID 5 or RAID 1 (1+0).

Basic building block is called the B4, which is 4 trays of disks (HDUs). In 9910 and 9970 B4 is all 4 HDUs of disks, in 9960 and 9980 a B4 is 4 (of 8) HDUs in a cabinet (bottom 4 or top 4). HDUs will be numbered in N shape. The same 4 drives in a B4 are a parity group, which is where the RAID level is set. A parity group will always be 4 drives. In 9970 and 9980 parity groups can span 2 B4's.

B4's are numbered 1 through 12; 1 and 2 are in cabinet R1, 3 and 4 are in L1, 5 and 6 are in R2 etc. Disk drives in each 9910 and 9960 HDU are numbered 0 through B (11), thus 12 drives. Disk drives in each 9970 and 9980 HDU are numbered 00 thru 0F and 10 thru 1F. Accessing drives 10 thru 1F requires an additional card in the HDU.
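The B4-to-cabinet numbering above alternates right and left in pairs, which is easy to get wrong on site. A small Python sketch of that rule follows; it assumes the "etc." continues the same pattern through R3 and L3 (B4's 9 to 12), and the function name b4_cabinet is made up for illustration.

```python
def b4_cabinet(b4):
    """Return the DKU cabinet (R1, L1, R2, L2, R3, L3) holding a given B4.

    Rule taken from the handbook: B4's 1,2 -> R1; 3,4 -> L1; 5,6 -> R2; ...
    Continuing that pattern to B4 12 is an assumption.
    """
    if not 1 <= b4 <= 12:
        raise ValueError("B4 numbers run from 1 through 12")
    pair = (b4 - 1) // 2               # 0 for B4 1-2, 1 for B4 3-4, ...
    side = "R" if pair % 2 == 0 else "L"
    return f"{side}{pair // 2 + 1}"


if __name__ == "__main__":
    for n in range(1, 13):
        print(n, b4_cabinet(n))
```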

Each parity group is set to an emulation mode, the system then divides that parity group into the appropriate number of LDEV's based on the emulation mode sizing. LDEV's can be presented on the host ports as LUN's as is or combined to create larger LUNs.

In 9910 and 9960 drive B (top last drive on left) in each HDU in the L1 and R1 DKU's is used as a universal spare, the bottom B4 drive B will always be a spare if installed, the top B4 drive B may be designated as spares or may be a normal parity group. In a 9910 any drives installed in slot B will be spares. In 9970 and 9980, drive 0F will be the spare (top left drive next to center cards). Same rules apply for slot 0F as B in 9910 and 9960.

In 9970 the HDU can be "split" using special cards to create two B4's.

Service Processor (SVP):

Windows PC mounted in array. 9970 and 9980 have optional second SVP mounted in cold standby.
Two modes of operation, View and Modify. View will come on when the Remote Console is connected. Disconnect Remote Console or reboot SVP to go back to Modify mode.

Page 77

Hitachi StorEdge 99X0 Arrays: (cont...)

Switches on the SVP Main Panel:
Information - allows review of messages (SIMs)
Maintenance - Select a component for replacement
Diagnosis -
FD Copy - Create a configuration floppy disk
Install - Initial Setup, microcode upgrades, etc.

Default remote console login: USER USER

Passwords:
raid-initialsetup
raid-install
raid-online
horc-forcibly

SVP FUNCTION tabs:
LDEV: format initialize drives/parity groups
HORC or Open TruCopy: Copy between subsystems
LUN Manager: map LUNs to ports
DCR: Dynamic Cache Residency aka Flash Access, LUN is mapped into cache
Shadow Image: copy data within subsystem
CVS/Virtual LUN: (small volumes) smaller than emulation mode size... use wasted space, make small volumes for DCR
On Demand/Just In Time: add additional space
LDEV Security SANtinel: LUN Masking

MAINTENANCE: lots of jumpers on boards, must be carefully checked. All changes must be made thru modify mode on the svp, carefully following the procedures. Repair procedures have a pre change section, a change section and a post change section, follow all steps. USE THE MANUALS (on CD comes with the firmware) !!

SunFire forgotten password: (SRDB 26846) This procedure works with firmware version 5.11.3 and higher.

If the platform administrator's password is lost, the following procedure can be used to clear the password.

1. Reboot the System Controller (SC). You won't be able to do this by logging into the platform shell. You'll need to hit the reset button on the SC to do this.

2. The normal sequence of a System Controller rebooting is for SCPOST to run, then ScApp. You'll need to wait for ScApp to start loading, then hit Control-A to spawn a vxWorks shell. SCPOST is done running when you see the message 'POST Complete'. At this point, ScApp will begin to load. When you see the copyright message 'Copyright 2001 Sun Microsystems, Inc. All rights reserved.', Hit CONTROL-A. You should see the following:

Task not found spawning new shell.
->

Page 78

Sunfire forgotten password: (cont:)

This last line is the vxWorks prompt. Keep in mind, that ScApp will still continue to load all the way to the point of giving you the menu to enter the platform/domain shells. To make it less confusing, wait for the ScApp menu to display on your screen, then hit return. You should see the vxWorks prompt -> again.

3. Make a note of the current boot flags settings. This will be used to restore the boot flags to the original value.

-> getBootFlags()

value = 48 = 0xC = '0' (Save the 0x number for # 8 below.)

4. Change the boot flags to disable autoboot.

-> setBootFlags (0x10)

5. Reboot the System Controller (CONTROL-X or reboot ). Once reset, it will stop at the -> prompt.

6. If you are running firmware 5.17.x or above, enter the following commands, otherwise, go to step 7:
-> ld 1,0,"/sc/flash/vxAddOn.o"
If you are running firmware 5.17.x or 5.18.x, enter the following command at the prompt
-> uncompressJVM("/sc/flash/JVM.zip", "/sc/flash/JVM");
If you are running firmware 5.19.x or later, enter the following command at the prompt
-> uncompressFile("/sc/flash/JVM.zip", "/sc/flash/JVM");

7. Enter the following commands at the -> prompt.
-> kernelTimeSlice 5
-> javaConfig
-> javaClassPathSet "/sc/flash/lib/scapp.jar:/sc/flash/lib/jdmkrt.jar"
-> javaLoadLibraryPathSet "/sc/flash"
-> java "-Djava.compiler=NONE -Dline.separator=\r\n sun.serengeti.cli.Password"

Wait for the following System Controller messages to display. Your prompt will come back right away, but it'll take about 10 seconds for these messages to show up:

Clearing SC Platform password...

Done. Reboot System Controller.

8. After the above messages are displayed, restore the bootflags to the original value using the setBootFlags() command.

-> setBootFlags (0xC) (Use the value returned from #3 above. )

9. Reboot the System Controller using CONTROL-X or the reboot command. Once rebooted, the platform administrator's password will be cleared.

Default Storage switch passwords: (telnet to the switches in the san.)
Sun 1GB switch: user: root passwd: ma31_glw
Sun 2GB switch: user: admin passwd: password
Brocade Switch: user: admin passwd: silkworm

Page 79

StorEdge Network FC Switch:

The StorEdge Network FC Switches are replacing the fibre hubs. When you receive them they are configured similar to a hub (all ports one zone). The switch will initially get its IP address by RARPing (though it has a default IP of 10.0.0.1). You cannot telnet to the switch, you must use the GUI to configure it (may change with future firmware).

Remember: each array in a zone must have a unique tag address or box id...

Setup: (on server)
- load San Foundation Kit (SUNWsan packages) http://storage.east/san
- load and patch SanSurfer GUI (pkgadd -d SUNWsmgr) EIS CD /sun/patch/SAN/8/
- add ethernet address and switch_name to /etc/ethers
- add IP address and switch_name to /etc/hosts
- check in.rarpd is up: ps -eaf | grep in.rarpd (start if not up /usr/sbin/in.rarpd -a &)
- turn on FC switch
- ping IP address of switch
- bring up GUI SanSurfer ( java -jar /usr/opt/SUNWsmgr/bin/Sun.jar) or ( /usr/opt/SUNWsmgr/bin/esm_smgr)
- login (default login: su, password: su) (can't login? add patch 110696)
- Click on IP Address and enter switch IP
- Configure the switch as needed. (rate field >20 scan rate for app to get stats)

To set up zoning: (from Fabric Screen)
- click on IP address of switch / zoom / zoning / add zone / click on port / apply

To edit network config: (from Fabric Screen)
- double click on `Fabric Name' of switch

To view zone config: (from Fabric Screen)
- click on IP address of switch / zoom / zoning / zone index 1,2,3 etc...

To clear all zones: (from Fabric Screen)
- click on IP address of switch / zoom / zoning / clear all zones

Useful SAN commands:

luxadm fcode -p (lists SUN/QLOGIC HBAs and firmware on each).
luxadm -e port (Here you would be looking for a connected status for your device in question.)
luxadm -e dump_map /devices/pci@1f,4000/SUNW,qlc@4/fp@0,0:devctl (path is from above command)
luxadm probe
luxadm display <path> (path from above or WWN)
ls -l /dev/cfg (This will show you paths to controller mapping.)
cfgadm -al (View what fabric devices are seen and configured and their condition)
cfgadm -c configure c# (to configure a device ex: cfgadm -c configure c5::50020f2300000cab)
cfgadm -o show_FCP_dev -al (list luns under each device. very handy when troubleshooting lun issues).
ls -l /dev/fc (give you fp to path mappings)
prtconf -vp|grep -i wwn (will give you the wwn of all configured HBAs on the system, this is a snap shot of what the prom saw at boot).

Page 80

Hitachi Lightning 9900V notes: also see: http://storage.east/hitachi

DKC - Disk (subsystem) Control Unit
DKU - Disk only frame: up to 4 DKUs: Left 1 (L1), Right 1 (R1), Left 2 (L2), Right 2 (R2) 9980V
SVP - Supervisor Console: 1/DKC standard, optional: 2nd SVP/DKC (NOTHING EXTRA loaded on SVP!!)
ACP / DKA - Array Control Processor / Disk adapter: same thing, connects to FSWs
CHA - Channel Adapter: contains fiber ports to connect to server
SM - Shared Memory: located on Cache bds, contains subsystem metadata
MDL - Maintenance Documentation Library
PDL - Product Documentation Library (includes User Guide - theory)
SIM - System Information Message (message led blinking means it cannot talk to SVP). Reference numbers can be looked up in SIMRC.PDF manual (on m/c CD). Action code points to a work ID (USE MANUALS!!!)
SSID - SubSystem ID: assigned number associated with: mainframes, 'Trucopy', 'Shadow Image'
HDD - Hard Disk Drive
HDU - Hard Disk Unit: up to 32 HDDs in a HDU: slot 0f is spare
B4 - Group of 4 HDUs (N shaped numbering, 0,2 on bottom: 1, 3 on top). 9970 has (1) B4 unless it has FSW 'c' cards then 2. 9980 B4 numbering: (R1) 1,2 (L1) 3,4 (R2) 5,6 (L2) 7,8 (lowest # on bottom)
FSW - Fiber Channel Interface Switch: PCB in HDU. Connects to DKA. (3) types A, B, C (switches)
SC - Single Cabinet (9970)
MC - Multi Cabinet (9980)

Cluster - set of boards in a subsystem. 2 clusters: CL1, CL2. Mirror config across clusters
Emulations - Lun Specifications (what type of disk drive do you want the lun to appear to be?)

Cannot Hot SWAP: Backplane, FSW 'B' boards

Available Raid Types: Raid 5 Raid 10

CU - Control Unit - an addressable list of Ldevs in shared memory. Rule of thumb: use same type of Ldev in a CU. If using another type of Ldev in system put them in another CU. Max 32 CUs, 256 Ldev/CU

LUSE - Lun Size Expansion: Make large Lun from Ldevs (concatenate)
CVS/VLL - Make smaller Luns from free space 35gb and lower (must be smaller than emulation size selected)

Parity Group (aka: Array Group): 4 disks only. Select physical disks, Select emulation, (this will give you a number of Ldevs depending on emulation) Assign Ldevs to CU

Lun Mapping: Map a Ldev to ports on the CHAs. Done thru Storage Navigator. Host mode 0 is standard, host mode 9 for Solaris, host mode C for windows

Host Groups: When Lun Security is on, up to 128 host groups/port. Can config host mode and have lun0 per group. Need to know WWN of HBA

High Speed Mode: All the processors on a CHA will be working 1 port: 1 port 4 procs (other 3 ports disabled)

Standard speed mode : 1 processor per port on a 4port CHA, 1 proc/2ports 8 port CHA

Offline SVP: Software (m/c CD) to load on your PC. Use to configure without SVP. Requires config floppy

DCI - Define Configure Install: DCI operation destroys customer data use for new install only. Use 'Change Configuration' on existing subsystems. (Shift ctl i raid-initialsetup)

Page 81

Hitachi Lightning 9900V notes: Cont.

How to figure needed disk capacity: (but don't forget spares)

Customer wants (10) 500gb luns. How many HDDs do you need?
1, (1) 500gb lun = (14) 36gb open-L Ldevs      500/36 = 13 r32 (round up to 14)
2, (10) luns = 140 Ldevs                       14x10 = 140
3, parity groups = 24 (6 Ldevs/parity group)   140/6 = 23 r2 (round up to 24)
4, 96 HDDs required                            24 parity groups x 4 disks/group = 96 disks
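The same arithmetic can be scripted for other LUN counts and sizes. This is a rough Python sketch of the four steps above, assuming the worked example's figures as defaults (36gb open-L Ldevs, 6 Ldevs per parity group, 4 disks per parity group); the function name hdds_required is hypothetical, and spares are not included.

```python
import math


def hdds_required(lun_count, lun_gb, ldev_gb=36, ldevs_per_group=6, disks_per_group=4):
    """Repeat the handbook's sizing steps; every division rounds up.

    Returns (Ldevs per lun, total Ldevs, parity groups, HDDs).
    Defaults are the values from the worked example, not fixed product limits.
    """
    ldevs_per_lun = math.ceil(lun_gb / ldev_gb)          # step 1
    total_ldevs = ldevs_per_lun * lun_count              # step 2
    parity_groups = math.ceil(total_ldevs / ldevs_per_group)  # step 3
    hdds = parity_groups * disks_per_group               # step 4
    return ldevs_per_lun, total_ldevs, parity_groups, hdds


if __name__ == "__main__":
    # The handbook example: ten 500gb luns.
    print(hdds_required(10, 500))
```

Remember to add the spare drives on top of the returned HDD count.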

Spare Disk Drives: Are available to any array group
Spares install in slot 0f of each HDU
Mandatory: B4-1, B4-3
Optional: B4-2, B4-4

Adding Frames: Watch HDU Jumper locations when adding frames.

Microcode CD:
- Read ECN (Engineering Change Notice) that comes with m/c CD
- Includes Manuals (use them)
- Includes Offline SVP software

M/C Upgrade Sequence:
- SVP
- Everything but DKU
- DKU

If message led is on, check subsystem status: (if blinking, communication problem with the SVP)
- Maintenance button on SVP

Special Key strokes:
shift-ctl i 'raid-initialsetup' used for DCI
alt-shift > update config diskette
shift-ctl m 'mode' puts you in mode mode for m/c upgrades
raid-install used in disk replacement

Storage Navigator - Allows you to do Lun mapping, LUSE, CVS, DCR, True Copy, Shadow Image from a client thru the lan to the SVP. Make sure the SVP is not in 'modify' mode so you can get write access.
Default login: root pwd: root
http://ipaddress-main-SVP//cgi-bin/utility/sjc0000.cgi

DCR/Flashaccess - Dynamic Cache Residency: Will keep a Ldev resident in cache, save on transfer time. If purchased set it up on install, will save downtime later

Page 82

Hitachi Lightning 9900V notes: Cont.

HDLM - Hitachi Dynamic Link Manager: Loaded on the server similar to DMP. /opt/dynamiclinkmanager/log /bin

Defaults:            Sun    Windows   Setting
Path Health Check    off    off       15 - 1444 min
auto failback        none   off

HDLM commands:
# dlnkmgr view (-path), (-sys)
          offline (-path)
          online (-path)
          set -ellv log-level, -elfs log-size, -systflv trace-level, -pchk, -s
          clear
          help

True Copy: Remote copy to another disk subsystem (9900 to 9900). Mainly used for disaster recovery. You configure it on each subsystem using Storage Navigator. One will be the Master (MCU) and the other Remote (RCU).

2 transfer methods:

SYNC: Data that is transferred to the MCU is in turn sent to the RCU thru a dedicated port. When the data is acknowledged at the RCU the MCU sends an acknowledgement back to the HBA

ASYNC: Data sent to the MCU is acknowledged to the HBA before the MCU receives acknowledgement from the RCU

The dedicated port has to be configured as 'initiator' on the MCU and 'RCU target' on the RCU. This port is a point to point connection between the disk subsystems.
The PVOL is the primary volume (Ldev) that data is sent to from the server.
The SVOL is the secondary volume (Ldev) on the RCU that True Copy copies to.

True Copy Volume States:

SMPL - simplex volume prior to any pair operation or result of 'pairsplit -s' command
COPY - (initial copy in progress) a result of a 'paircreate' command
PAIR - initial copy complete and doing updates as data changes on pvol
PSUS - pair operations suspended as a result of a 'pairsplit' command
PSUE - pair operations suspended as a result of a failure

To setup True Copy:
- Decide on PVOL and SVOL
- SSID (need to know, get from customer)
- Serial number of each disk subsystem
- setup Path between the subsystems (ports, cables, etc...)
- Define MCU to RCU path
- ASYNC only: Define Consistency Groups (order in which you want data sent to svol)
- True Copy (create pairs)

Page 83

Hitachi Lightning 9900V notes: Cont.

Shadow Image: A local copy within a disk subsystem. Configured using Storage Navigator.

The PVOL is the primary volume (Ldev) the data is sent to from the server. The SVOL is the secondary volume (Ldev) that Shadow Image copies to. You can have a max of 9 copies (svols), this includes (3) level 1 SVOLs and (6) level 2 SVOLs (cascade)

         Level1      Level2
                    _______S
         _____S____/
        |          \_______S
        |           _______S
P_______|_____S____/
        |          \_______S
        |           _______S
        |_____S____/
                   \_______S

Shadow Image Commands:

paircreate: starts an initial copy and results in a PVOL SVOL pair
pairsplit: splits the pair. quick or steady options. Level1 must be split before level2. A split will synchronize data btwn the PVOL and SVOL before the split.
pairresync: will resynchronize a suspended pair.

Quick Functions:
quicksplit: makes it possible to read and write SVOLs immediately after split
quickresync: reduces the resync time considerably
quickrestore: reduces restore time considerably

Minnow StorEdge 3300 Series array: (also see page 110 for disk replacement)

OEM'd from Dot Hill. Small cheap array. Scsi hardware raid and jbod. Fiber array soon.
Raid levels 0, 1, 3, 5, 1+0, and 0+1 supported.
Up to 12 drives per box, 2 redundant RAID controllers.

Model 3310 Ultra 160 LVD SCSI (will work Single Ended as well).
Use new LVD card and SUNWqus driver.

Logical Disk or Group - the raid setup from the disks.
Logical Volume - a raid of logical disks (how they do 1+0 and 0+1).
Partitioning - may map a chunk of LD or LV.
Local spares assigned to particular LD or LV.
Global spares assigned within array.

Luns are created and owned by one controller, other is failover for it. Controllers can be active/active or active/passive. All interface to array is done thru the master controller.

Parts are raid controllers (2), event monitor units (emus) (2), power supplies (2), terminator board (1), io board (1), disks (12). All hot swappable. Replacing terminator and io board will interrupt io.

Page 84

Minnow StorEdge 3300 Series array: cont...

Cabling can be complex, refer to manual. 4 channels within box, 2 are for host, 2 for drives.
Single bus - all drives same channel.
Dual bus - split drives between two channels (split drives 1-6 & 7-12, channels 0 & 2).

Any box combination, maximum of 16 drives per any one channel.

IO Board
Channels 0 and 2 are drive channels
Channels 1 and 3 are host channel ports
SB and DB ports are “jumper” ports:
Single bus jumper cable from channel 0 to SB port.
Dual bus jumper cable from channel 2 to DB port.

Expansion unit (JBOD) has no controllers, has 4 port IO Board (A, Aterm, B, Bterm).
Aterm and Bterm are self terminating ports, need to be at end of chain.
Single bus in expander: jumper cable from B to Aterm.
Dual bus in expander: no jumper cable installed.
If adding an expander to a “controller” box run the cable to the “non term” ports.

Box Management thru serial port or GUI (GUI doesn't work well yet).

If using network connect both controllers to same subnet, only master controller has ip address. IP assigned by DHCP or static thru serial port connection.
Standard RS232 null modem (9 pin female) serial cable to either controller. Settings are 38400 baud, 8N1.
control-l refreshes screen (if just connected to running array hit control-l choose VT100 mode)
control-w switches between the controllers.
control-acbd reset to factory defaults, password “oemmaint”

Config tool is a text based menu, common to all arrays, main selections are: (use Return and ESC to navigate)

view & edit logical drives (create, expand, delete, raid configs, partition, set spares)
view & edit logical volumes (create, delete logical volumes)
view & edit host luns (assign lun id's and map host channels)
view & edit scsi drives (view drive status, flash drive leds, set global spares, clone drives)
view & edit scsi channels (status, properties, set controller target id)
view & edit config parameters (controller settings, set baud and ip address)
view & edit peripheral devices (set expansion box, secondary controller, array status (emu))
view system information (cache size, firmware revision, etc...)
system functions (reset, shutdown, fw upgrade)
event logs

Create LUNs: (in general, example does not use logical volumes so no “+” raid levels)
setup global spares - v/e scsi drives - select disk - add global spare - yes
setup logical drive - v/e logical drives - select LG - create logical drive - yes - raid - select disks - capacity - ESC
partition logical drive - v/e logical drives - select logical drive - partition - select partition (arrow) - size - yes
map luns to host - v/e host luns - select controller - select lun# - select logical drive - select partition - map (y)

Modify /kernel/drv/sd.conf: (must do for all lun #'s other than 0)
create the following 2 line entry for each lun: (change target and lun)

name="sd" class="scsi"
target=# lun=#;
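When many LUNs are mapped, the two-line sd.conf entries above can be generated instead of typed by hand. A minimal Python sketch, assuming one SCSI target and a list of LUN numbers; the helper name sd_conf_entries is made up for illustration, and LUN 0 is skipped because the handbook says entries are only needed for LUNs other than 0.

```python
def sd_conf_entries(target, luns):
    """Return one two-line sd.conf entry per non-zero LUN on the given target."""
    entries = []
    for lun in luns:
        if lun == 0:
            continue  # lun 0 needs no extra entry, per the handbook note
        entries.append(f'name="sd" class="scsi"\n    target={target} lun={lun};')
    return entries


if __name__ == "__main__":
    # Example: target 1 with luns 0 through 3 mapped.
    print("\n".join(sd_conf_entries(1, range(4))))
```

Append the output to /kernel/drv/sd.conf only after checking that the entries are not already present.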

Page 85

Tuning ecache scrubber scan rate:

See FIN I0755-1. The following procedure can help on UltraSparc II servers that experience ecache failures. It is best used on servers where mirrored ecache is not an option. The procedure increases the scan rate from 100 times a second to 1000 times a second. It will increase the system utilization by about 1%.

To adjust ecache_scan_rate: 1. As root, run the following command to adjust ecache_scan_rate.

# echo 'ecache_scan_rate/W 0t1000' | adb -kw

NOTE: This does not require downtime. Be very careful, though, as mis-typing the command could result in downtime.

2. To make the change permanent, add the parameter setting to /etc/system. It is best to insert all 3 parameters together into /etc/system if the settings are not already there:

set ecache_scrub_enable=1
set ecache_scan_rate=1000
set ecache_calls_a_sec=100

To check a system's current setting use the following command. This does not modify the setting in any way:

# echo 'ecache_scan_rate/D' | adb -k
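Before editing /etc/system it helps to know which of the three settings are already there. The following Python sketch reports the missing lines; it is only an illustration, it assumes the values given in step 2, and the name missing_settings is hypothetical. Lines starting with '*' are treated as /etc/system comments.

```python
WANTED = {
    "ecache_scrub_enable": "1",
    "ecache_scan_rate": "1000",
    "ecache_calls_a_sec": "100",
}


def missing_settings(system_text):
    """Return the 'set name=value' lines from step 2 not present in the text."""
    present = {}
    for raw in system_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("*"):   # blank line or comment
            continue
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] == "set" and "=" in parts[1]:
            name, value = parts[1].split("=", 1)
            present[name.strip()] = value.strip()
    return [f"set {k}={v}" for k, v in WANTED.items() if present.get(k) != v]


if __name__ == "__main__":
    with open("/etc/system") as fh:
        for line in missing_settings(fh.read()):
            print(line)
```

The script only prints what is missing; adding the lines to /etc/system is still a manual, reviewed step.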

VxWorks (serengeti SC): Use when you cannot get into scapp or to recover a failed SC flashupdate

- Reset the SC using the reset button on the front of the SC.
- when “Copyright 2001-2002 Sun Microsystems, Inc. All rights reserved. Use is subject to license terms.” appears hit CTRL A

->setBootFlags(0x0) CTRL X (will reboot and stop booting at the -> PROMPT.)
->setBootFlags(0xd) (then “reboot” to change the boot settings back so SC automatically boots ScApp)

- to configure a netmask: -> ifMaskSet("eri0", 0xffffff00) (example will set it to 255.255.255.0)
- to configure an IP address: -> ifAddrSet("eri0", "129.146.232.222")
- to enable the network interface: -> ifFlagSet("eri0", 0x8063)
- to configure a default router: -> routeAdd("0.0.0.0", "129.146.232.10")
- to register the name/address of a server: -> hostAdd("myhost", "129.146.240.105")
- to test the network interface: -> ping "myhost",1 or ping "129.146.240.105",1
- to update the ScApp flashprom in VxWorks:
updateBootFlashURL("ftp://login:password@myhost/path_to/sgrtos.flash")
updateScAppFlashURL("ftp://login:password@myhost/path_to/sgsc.flash")

Page 86
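ifMaskSet above takes the netmask as a hex number rather than dotted decimal. A small Python sketch for converting between the two forms follows; it uses only the standard ipaddress module, and the helper names are made up for this example.

```python
import ipaddress


def netmask_to_hex(mask):
    """Dotted-decimal netmask -> hex string in the form ifMaskSet expects."""
    return f"0x{int(ipaddress.IPv4Address(mask)):08x}"


def hex_to_netmask(value):
    """Integer netmask (e.g. 0xffffff00) -> dotted-decimal string."""
    return str(ipaddress.IPv4Address(value))


if __name__ == "__main__":
    print(netmask_to_hex("255.255.255.0"))
    print(hex_to_netmask(0xFFFFFF00))
```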

LVD PCI Adapter: (ultra scsi-3 375-3057)

Code named jasper, it is a Low Voltage Differential card. Mainly supports the S1, D2 and Minnow (SE 3310) arrays.
The LVD drivers are not on any Solaris CD yet, 8 02/02 or 9. Until a bootable CD is released that has driver support for the LVD, you will have to either make a temp boot disk and patch it, or boot net from a patched image, to see the disks on a LVD adapter.

do the following to see disks on a LVD adapter: (drivers and patches available on EIS cd sun/progs, sun/patch)

- add_install_server (create the solaris 8 image)
- download "QUS" drivers from www.sun.com (all four) SUNWqus, SUNWqusu, SUNWqusx and SUNWqusux.
- download patch 112697-02 from sunsolve
- pkgadd -R /boot_dir/Solaris_8/Tools/Boot -d . (add drivers to boot image)
- patchadd -R /boot_dir/Solaris_8/Tools/Boot 112697-02 (patch boot image)
- add_install_client (enable client to boot net from server)
- boot net (client)

Once loaded you can install Solaris on the LVD disks. But you have to select 'manual boot' so you can then patch the install image before reboot as follows:

- cd /net/ipaddress_of_install_server/shared_dir_where_pkgs_located/
- pkgadd -R /a -d . (add all four pkgs 32 and 64 bit)
- patchadd -R /a 112697-02
- reboot

see doc 816-2156-11.pdf StorEdge PCI Dual Ultra3 SCSI Host Adapter Install Guide.

CP1500 - (15k SC) replacement: (see fin I0761-1)

The Nordica bd is used both in the netra line and the SC of a 15k. When replacing the Nordica bd (501-5473) in a 15k you have to upgrade the OBP so you will have all the SC functionality. The info doc says you should do the procedure on rev -12 and below. We had to do the procedure on a -13 board to get it to work (without it we could not see the 'man' network interfaces). In general you have to: (see fin and download readme for specifics)

- download the CP1500 OBP image
- run the downloaded script (updates CP1500 OBP to 3.14.6)
- flash the SC (flashupdate)
- reset OBP parameters

You can find “The current Nordica OBP firmware image available for download” at : http://pts-americas.west/esg/hsg/starcat/patches.html

Serengeti/15k Dynamic Reconfiguration: minimum requirements are Solaris 8 (02/02 u7) and SC 5.12.6 (also see 15k dr examples page 109)

(Solaris commands)
To get a list of component NAMES:  # cfgadm -al
To remove a bd from a domain:      # cfgadm -o unassign,nopoweroff -c disconnect NAME (ex: N0.SB1)
To add a bd into a domain:         # cfgadm -v -c configure NAME (ex: N0.SB1)
To see if board has perm mem:      # cfgadm -val | grep permanent

Page 87

To clean up non-root disk "controller" numbers: (see info docs 15019, 27756)

# mv /etc/path_to_inst /etc/path_to_inst.orig
# rm /etc/path_to_inst.old
# cd /dev/dsk
# rm c1* c2* c3* c4* (do not remove your boot device)
# cd /dev/rdsk
# rm c1* c2* c3* c4* (do not remove your boot device)
# rm -rf /dev/cfg/* (new on Solaris 8)

If the boot disk is under Sun StorEdge Volume Manager, search for "rootdev:" in /etc/system.
ex: rootdev: /pseudo/vxio@0:0 (Write down this device name exactly; you will use it on boot.)

# init 0
ok boot -ar (take the default through all prompts except: "Do you want to rebuild this file [n]?" y)

(If you have the boot disk under StorEdge Volume Manager, when asked for the physical root device, enter the device name you found above.)

Set network parameters at boot:

ok> boot net:speed=100,duplex=full (no spaces)

Starcat Portid cheat sheet:

Decimal:
------------------------------------------------------------------
| Exp| cpu0| cpu1| cpu2| cpu3| max0| max1| pci0| pci1| axq0| axq1|
------------------------------------------------------------------
|  0 |   0 |   1 |   2 |   3 |   8 |   9 |  28 |  29 |  30 |  31 |
|  1 |  32 |  33 |  34 |  35 |  40 |  41 |  60 |  61 |  62 |  63 |
|  2 |  64 |  65 |  66 |  67 |  72 |  73 |  92 |  93 |  94 |  95 |
|  3 |  96 |  97 |  98 |  99 | 104 | 105 | 124 | 125 | 126 | 127 |
|  4 | 128 | 129 | 130 | 131 | 136 | 137 | 156 | 157 | 158 | 159 |
|  5 | 160 | 161 | 162 | 163 | 168 | 169 | 188 | 189 | 190 | 191 |
|  6 | 192 | 193 | 194 | 195 | 200 | 201 | 220 | 221 | 222 | 223 |
|  7 | 224 | 225 | 226 | 227 | 232 | 233 | 252 | 253 | 254 | 255 |
|  8 | 256 | 257 | 258 | 259 | 264 | 265 | 284 | 285 | 286 | 287 |
|  9 | 288 | 289 | 290 | 291 | 296 | 297 | 316 | 317 | 318 | 319 |
| 10 | 320 | 321 | 322 | 323 | 328 | 329 | 348 | 349 | 350 | 351 |
| 11 | 352 | 353 | 354 | 355 | 360 | 361 | 380 | 381 | 382 | 383 |
| 12 | 384 | 385 | 386 | 387 | 392 | 393 | 412 | 413 | 414 | 415 |
| 13 | 416 | 417 | 418 | 419 | 424 | 425 | 444 | 445 | 446 | 447 |
| 14 | 448 | 449 | 450 | 451 | 456 | 457 | 476 | 477 | 478 | 479 |
| 15 | 480 | 481 | 482 | 483 | 488 | 489 | 508 | 509 | 510 | 511 |
| 16 | 512 | 513 | 514 | 515 | 520 | 521 | 540 | 541 | 542 | 543 |
| 17 | 544 | 545 | 546 | 547 | 552 | 553 | 572 | 573 | 574 | 575 |
------------------------------------------------------------------

In Hex:
------------------------------------------------------------------
| Exp| cpu0| cpu1| cpu2| cpu3| max0| max1| pci0| pci1| axq0| axq1|
------------------------------------------------------------------
|  0 |   0 |   1 |   2 |   3 |   8 |   9 |  1c |  1d |  1e |  1f |
|  1 |  20 |  21 |  22 |  23 |  28 |  29 |  3c |  3d |  3e |  3f |
|  2 |  40 |  41 |  42 |  43 |  48 |  49 |  5c |  5d |  5e |  5f |
|  3 |  60 |  61 |  62 |  63 |  68 |  69 |  7c |  7d |  7e |  7f |
|  4 |  80 |  81 |  82 |  83 |  88 |  89 |  9c |  9d |  9e |  9f |
|  5 |  a0 |  a1 |  a2 |  a3 |  a8 |  a9 |  bc |  bd |  be |  bf |
|  6 |  c0 |  c1 |  c2 |  c3 |  c8 |  c9 |  dc |  dd |  de |  df |
|  7 |  e0 |  e1 |  e2 |  e3 |  e8 |  e9 |  fc |  fd |  fe |  ff |
|  8 | 100 | 101 | 102 | 103 | 108 | 109 | 11c | 11d | 11e | 11f |
|  9 | 120 | 121 | 122 | 123 | 128 | 129 | 13c | 13d | 13e | 13f |
| 10 | 140 | 141 | 142 | 143 | 148 | 149 | 15c | 15d | 15e | 15f |
| 11 | 160 | 161 | 162 | 163 | 168 | 169 | 17c | 17d | 17e | 17f |
| 12 | 180 | 181 | 182 | 183 | 188 | 189 | 19c | 19d | 19e | 19f |
| 13 | 1a0 | 1a1 | 1a2 | 1a3 | 1a8 | 1a9 | 1bc | 1bd | 1be | 1bf |
| 14 | 1c0 | 1c1 | 1c2 | 1c3 | 1c8 | 1c9 | 1dc | 1dd | 1de | 1df |
| 15 | 1e0 | 1e1 | 1e2 | 1e3 | 1e8 | 1e9 | 1fc | 1fd | 1fe | 1ff |
| 16 | 200 | 201 | 202 | 203 | 208 | 209 | 21c | 21d | 21e | 21f |
| 17 | 220 | 221 | 222 | 223 | 228 | 229 | 23c | 23d | 23e | 23f |
------------------------------------------------------------------
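Every row of the portid tables follows the same pattern: the expander number times 32, plus a fixed offset per device column. The sketch below reproduces the tables from that pattern. The rule is inferred from the table values rather than taken from Sun documentation, and the function names are hypothetical.

```python
# Offset of each device column within an expander's block of 32 portids,
# read off the "Exp 0" row of the table (assumption: the pattern holds
# for every expander, as it does for all 18 rows shown).
DEVICE_OFFSETS = {
    "cpu0": 0, "cpu1": 1, "cpu2": 2, "cpu3": 3,
    "max0": 8, "max1": 9,
    "pci0": 28, "pci1": 29,
    "axq0": 30, "axq1": 31,
}


def portid(expander: int, device: str) -> int:
    """Return the decimal portid for a device on expander 0..17."""
    if not 0 <= expander <= 17:
        raise ValueError("expander must be in the range 0..17")
    return expander * 32 + DEVICE_OFFSETS[device]


def decode_portid(pid: int) -> tuple:
    """Return (expander, device) for a portid that appears in the table."""
    expander, offset = divmod(pid, 32)
    for name, dev_offset in DEVICE_OFFSETS.items():
        if dev_offset == offset and 0 <= expander <= 17:
            return expander, name
    raise ValueError("portid %d is not in the table" % pid)


if __name__ == "__main__":
    pid = portid(13, "max0")
    print(pid, hex(pid), decode_portid(pid))
```

decode_portid is handy when an error message gives only a hex portid: pass it as int("1a8", 16).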

Page 88

Starcat SC: clean the slate: (bring down domains)

Clean dump and post files in /var/opt/SUNWSMS/adm/A-R
Remove all boards from domains: ex: # deleteboard SB0 SB1 IO0 IO1 etc.
Stop SMS on both SCs: /etc/init.d/sms stop
# mv /etc/opt/SUNWSMS/SMS1.3/config/MAN.cf /etc/opt/SUNWSMS/SMS1.3/config/MAN.old
# sys-unconfig
Without the MAN.cf file it is as though smsconfig -m has never been run.

Starcat redx info: check out : http://pts-americas.west/esg/hsg/starcat/tools/xcredx.html

# redx -l (will put you in local mode to look at dumps; redxl.csh for non-SC analysis)
redxl> dumpf load dump-file-name (will load dump and give you a brief summary)
redxl> dumpf types (will list the domain board configuration)
redxl> wfail (will give you failure info: "1E" = 1st error, "1E+" = accumulated errors)

SB (slot 0) redx commands:
redxl> shproc 0 0 3 (show PROC. 0 0 3 = exb 0 slot0 cpu 3. shproc connects to DCDS, SDC, AR, SBBC)
redxl> shdcds 0 0 1 (show DCDS. 0 0 1 = exb 0 slot0 dcds 1. shdcds connects to PROC, DX)
redxl> shdx 0 0 3 (show DX. 0 0 3 = exb 0 slot0 dx 3. shdx connects to SDI(exb), DCDS)
redxl> shar 0 0 (show AR. 0 0 = exb 0 slot0. shar connects to AXQ(exb), SDI 0(exb), PROCs)
redxl> shbbc 0 0 1 (show SBBC. 0 0 1 = exb 0 slot0 sbbc 1. shbbc connects to SDC, PROCs)
redxl> shsdc 0 0 (show SDC. 0 0 = exb 0 slot0. shsdc connects to SBBC, PROCs)

I/O (slot 1) redx commands:
redxl> shioc 0 1 1 (show IOC. 0 1 1 = exb 0 slot1 ioc 1. shioc connects to SDC, DXs, AR)
redxl> shar 0 1 (show AR. 0 1 = exb 0 slot1. shar connects to AXQ(exb), SDI 0(exb), IOCs)
redxl> shdx 0 1 1 (show DX. 0 1 1 = exb 0 slot1 dx 1. shdx connects to SDI(exb), IOCs)
redxl> shsdc 0 1 (show SDC. 0 1 = exb 0 slot1. shsdc connects to SBBC, IOCs)
redxl> shbbc 0 1 (show SBBC. 0 1 = exb 0 slot1. shbbc connects to SDC, IOCs)

Expander (exb) redx commands:
redxl> shaxq 0 (show AXQ. 0 = exb 0. shaxq connects to AMXs(cp), ARs, SDCs, SDI 0)
redxl> shcbr axq 0 (show CBR AXQ. 0 = exb 0)
redxl> shsdi 0 0 (show SDI. 0 0 = exb 0 sdi 0. shsdi connects to DARBs(cp), DMXs(cp), ARs, SDCs, SDIs(exb), AXQ(exb). 6 SDIs/exb)
redxl> shcbr exb 0 (show CBR EXB. 0 = exb 0)

CenterPlane (cp) redx commands:
redxl> shamx 0 1 (show AMX. 0 1 = cp 0 amx 1. shamx connects to AXQs(exbs))
redxl> shrmx 1 (show RMX. 1 = cp 1. shrmx connects to AXQs(exbs))
redxl> shdmx 0 (show DMX. 0 = cp 0. shdmx connects to SDIs(exbs), port 0-3, 1-2, 2-1, 3-0, 4-4, 5-5)
redxl> shdarb 1 (show DARB. 1 = cp 1. shdarb connects to SDI 0(exbs), shows domain configs)

Terms:
AR    Address Repeater (1 per SB, IO, max CPU)
AMX   Address MultipleXer (2 per centerplane bus C0, C1)
AXQ   Address controller (1 per expander board)
DARB  Data ARBiter (1 per centerplane bus C0, C1)
DCDS  Dual CPU Data Switch (2 per SB, 1 per Max CPU. 1 DCDS for 2 PROCs)
DMX   Data MultipleXer (6 per centerplane bus C0, C1; connects to SDI exbs)
DX    Data Switch (4 per slot0, 2 per slot1 bd)
RMX   Response MultipleXer (1 per centerplane bus C0, C1)
SBBC  System Boot Bus Controller (2 per slot0, 1 per slot1 bd)
SDC   System Data path Controller (1 per SB, IO, max CPU)
SDI   System Data Interface (6 per EXB; 0 is master, connects to DMXs)

Page 89

StorADE:

Has diagnostics included in it that are supposed to replace StorTools. A lot of the new arrays and fibre channel backplanes are supported.

You can bring up the GUI by typing (in a browser window, any server): http://hostname:7654 (default login: ras password: agent)
(I found the cli diags to be more useful than the GUI.)
New cli storage diagnostics are located in /opt/SUNWstade/Diags/bin (listing below).

6120test     - tests the functionality of disks in a 6120 array (minnow)
a5ktest      - tests the functionality of disks in the Sun StorEdge A5000 and A5200 array
a5ksestest   - tests Sun StorEdge A5000 and A5200 arrays
a3500fctest  - verifies functionality of Sun StorEdge A3500FC disk tray
brocadetest  - diagnoses Brocade Silkworm Fibre Channel switches
d2disktest   - tests the functionality of the internal Sun StorEdge D2 Array disk
daksestest   - tests Sun Fire V880 FC-AL disk backplanes
daktest      - tests the Sun Fire V880 FC-AL disk
dex          - Device Exerciser for Sun StorEdge arrays
discman      - discovery manager
disk_inquiry - disk-only version of the inquiry program
disktest     - No manual entry
enc_inquiry  - No manual entry
fcdisktest   - tests the functionality of internal fibre channel disk
fctapetest   - tests the functionality of Fibre Channel tape drives
ifptest      - tests functionality of the PCI FC-100 Fibre Channel-Arbitrated Loop (FC-AL) card
lbf          - a loop back frame diagnostic utility program that tests Fibre Channel-Arbitrated Loops (FC-AL)
linktest     - diagnoses Sun StorEdge network passive Fibre Channel components
linktest2    - No manual entry
ofdg         - No manual entry
ondg         - No manual entry
qlc_hba      - displays stats on qlc hba
qlctest      - tests the functions of the 1 Gb and 2 Gb PCI and cPCI Fibre Channel Network Adapter boards
socaltest    - tests the SOC+ host adapter card
stresstest   - checks for possible SAN errors
switchtest   - diagnoses Sun StorEdge Network Fibre Channel switch-8 and switch-16 switches
t3test       - tests the functionality of the Sun StorEdge T3 and T3+ array LUNs
vediag       - runs virtualization engine diagnostics through SLICD
veluntest    - tests the functionality of the virtualization engine by accessing the VLUNs
volverify    - No manual entry

Get fru info from a Serengeti: (prtfru does not work on Serengeti; explorer must be loaded)

# cd /opt/SUNWexplo/bin
# LD_LIBRARY_PATH=/opt/SUNWexplo/lib
# export LD_LIBRARY_PATH
# CLASSPATH=/opt/SUNWexplo/java/fruid-scappclient.jar:/opt/SUNWexplo/java/libfru.jar
# export CLASSPATH
# ./rprtfru.sparc -b sc_ip_address:password >/tmp/fruid
(must use password; will put output in file /tmp/fruid)

Page 90

SWAP

What swap size is recommended now (2003) for servers with gigabytes of physical memory?
(http://docs.sun.com/db/doc/817-0798/6mgisnqfi?a=view)

System Type                                         Swap Space Size   Dedicated Dump Device Size
Workstation, 4 GB of physical memory                1 Gbyte           1 Gbyte
Mid-range server, 8 GB of physical memory           2 Gbytes          2 Gbytes
High-end server, 16 to 128 GB of physical memory    4 Gbytes          4 Gbytes
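As a rough sketch, the swap table above can be turned into a lookup. The table only gives three reference points (4 GB, 8 GB, 16 to 128 GB), so the cut-offs between tiers below are assumptions made for illustration, and the function name is hypothetical.

```python
def recommended_swap_gb(ram_gb: float) -> int:
    """Return the swap (and dedicated dump device) size in GB for a given
    amount of physical memory, following the three tiers in the table.

    Assumption: anything up to 4 GB is treated as the workstation tier,
    anything up to 8 GB as mid-range, and everything larger as high-end.
    """
    if ram_gb <= 0:
        raise ValueError("ram_gb must be positive")
    if ram_gb <= 4:
        return 1
    if ram_gb <= 8:
        return 2
    return 4


if __name__ == "__main__":
    for ram in (4, 8, 64):
        print(ram, "GB RAM ->", recommended_swap_gb(ram), "GB swap")
```

Treat the result as a starting point only; the measurements described under performance considerations should drive the final size.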

Performance considerations: How much and how often?

# swap -s (command to monitor swap resources)
# swap -l (command to determine if your system needs more swap space)

How do you get an estimate of needed swap per application?
# pmap -r pid# (sol 8, 9) (shows heap used per process; add up the heap to get an idea)
# pmap -Sa pid# (sol 9) (will show all reservations by each process)

How to tell how much swapping is going on? (if too much, consider adding more physical memory)
# vmstat 5 5 (look at the sr column, and also note po, the page-out column. Non-zero sr means the page scanner is looking for pages to mark as free; non-zero po means we are sending pages out.)
# iostat -npxc 5 5 (check kw/s on the swap partition: if it is non-zero, the page-outs from vmstat really are writes to the swap partition(s).)
(http://docs.sun.com/db/doc/816-4553/6maop1hik?a=view)
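To watch the sr and po columns over many samples, the vmstat output can be parsed by column name instead of by position. This is a minimal sketch: the sample lines are made-up values for illustration, the header is the usual Solaris second header line, and the function names are hypothetical.

```python
def paging_columns(header: str, sample: str) -> dict:
    """Pick the po (page-out) and sr (scan rate) values out of one vmstat
    sample line, using the column-name header line to locate them."""
    names = header.split()
    values = sample.split()
    if len(names) != len(values):
        raise ValueError("header and sample have different column counts")
    # "po" and "sr" each appear once in the header, so index() is safe here.
    return {col: int(values[names.index(col)]) for col in ("po", "sr")}


def looks_short_on_memory(header: str, samples: list) -> bool:
    """True if any sample shows the page scanner running (sr > 0)."""
    return any(paging_columns(header, s)["sr"] > 0 for s in samples)


# Column-name header as printed by Solaris vmstat (second header line).
HEADER = "r b w swap free re mf pi po fr de sr s0 s1 s2 s3 in sy cs us sy id"

# Made-up sample lines, for illustration only.
SAMPLES = [
    "0 0 0 1204512 88120 3 12 0 0 0 0 0 0 0 0 0 412 980 310 2 1 97",
    "0 1 0 1180304 10240 45 210 8 96 140 0 512 4 0 0 0 655 2100 540 9 7 84",
]

if __name__ == "__main__":
    for line in SAMPLES:
        print(paging_columns(HEADER, line))
    print("short on memory:", looks_short_on_memory(HEADER, SAMPLES))
```

The first line vmstat prints is an average since boot, so it is usually better to skip it before feeding samples in.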

Dump considerations: How much memory do you want dumped? all, kernel, or kernel + active process

# dumpadm
      Dump content: kernel pages
       Dump device: /dev/dsk/c0t3d0s1 (swap)
Savecore directory: /var/crash/pluto ***(large enough to hold core)
  Savecore enabled: yes

# dumpadm -c all -d /dev/dsk/c0t1d0s1 -m 10%
      Dump content: all pages
       Dump device: /dev/dsk/c0t1d0s1 (dedicated)
Savecore directory: /var/crash/pluto (minfree = 77071KB)
  Savecore enabled: yes

# savecore -L (live core dump. WATCH OUT: do not do a savecore -L to a dump slot under volume manager control)

DR considerations: How much physical memory is on the most populated system board?
Nonpermanent memory (currently 32 GB physical memory max per board): before you can delete a board, the environment must vacate the memory on that board. Vacating a board means flushing its nonpermanent memory to swap space.

http://education.central/AliasArchive/Archives/ILT/ses_systemadmin-ext/msg08612.html
http://education.central/AliasArchive/Archives/ILT/ses_systemadmin-ext/msg05509.html

Page 91

From /net/cores.central/cores/dir5/ (REAL DATA: looked at explorer for RAM size and expanded the core to check its size)

RAM      Core size   Type   Solaris
24gb     1.7gb       k      8
20gb     984mb       k      8
18gb     1gb         k      8
16gb     900mb       k      8
16gb     884mb       k      8
16gb     2.4gb       k      8
10gb     800mb       k      8
8gb      1.2gb       k      8
6gb      518mb       k      2.6
4gb      300mb       k      8
4gb      594mb       k      8
4gb      305mb       k      2.6
4gb      435mb       k      7
4gb      2mb         a      8
2gb      155mb       k      2.6
2gb      243mb       k      8
2gb      234mb       k
2gb      374mb       k      8
2gb      263mb       k      7
1.5gb    220mb       k      8
1gb      997mb       k      7
1gb      138mb       k      8

Maserati Notes- StorEdge 6320 and 6120:

Two models:
6120 - standalone, desk side or rack, like T3 WG or PP.
6320 - rack solution like the 3900 (Indy); includes service processor and management net.

Next generation T3, just don't call it the T4. Very much like the T3: drives in front, two power supplies in back on top, one controller, two loop cards. Components are similar to the T3 but are physically enclosed differently, and are not swappable between T3 and T4. Units are 3U high. On the back, the controller is in the middle with loop cards on each side. Loop cables are different (they use an RJ-45 type connector). All fiber connections use the LC style connector.

Arrays are 2 Gb capable on the front end using the Qlogic 2300 chipset. Internally they run at 1 Gb using the Qlogic 2200 chipset.

Model marketing designation is a 'YxZ' config, where Y = # of controllers and Z = # of trays. Each controller can have 1 to 3 disk trays associated with it. One tray will have the controller in it; the other 2 will have no controller. Trays are joined via the loop cards. Min config is 1x1 (1 controller, 1 box); max config is 2x6 (2 controllers, 6 boxes). Controller redundancy is done through a partner pair type config, just like the T3 except with the expansion trays factored in.
Up to 14 drives per tray; 7 is the minimum supported number (though only 4 will work). Drive slot 14 is the hot spare location. Like the T3 you don't have to have a spare, but if you do it must be slot 14. Drive sizes are 36GB, 73GB and 146GB.
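The 'YxZ' rules above (1 or 2 controllers, 1 to 3 trays per controller, 1x1 minimum, 2x6 maximum) can be sketched as a small checker. The function names are hypothetical, and the check encodes only the limits stated in these notes; the actual orderable configurations may be more restrictive.

```python
import re


def parse_config(designation: str) -> tuple:
    """Split a 'YxZ' designation such as '2x6' into (controllers, trays)."""
    match = re.fullmatch(r"(\d+)x(\d+)", designation.strip())
    if match is None:
        raise ValueError("expected a designation like '2x6'")
    return int(match.group(1)), int(match.group(2))


def is_valid_config(designation: str) -> bool:
    """Check a designation against the limits in the notes: 1 or 2
    controllers, and between 1 and 3 trays for each controller."""
    controllers, trays = parse_config(designation)
    return controllers in (1, 2) and controllers <= trays <= 3 * controllers


if __name__ == "__main__":
    for name in ("1x1", "1x3", "2x6", "1x4", "3x3"):
        print(name, is_valid_config(name))
```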

All commands are the same as a T3 with 2.1 and above firmware. Max luns per array is 64, max luns per volume is 32. Each tray is still limited to two volumes, using contiguous disks. If you have the min config (7 disks) and build two volumes, you will need to remove/create a volume to add more disks.

Note: internally, brick terminology is the same as T3 (volslice, volume). However, the Maserati manuals refer to them as pools (volumes on T3) and volumes (volslices on T3).

Page 92

Maserati Notes- StorEdge 6320 and 6120: cont.

6120 LED indicators:
Green  - Normal
Yellow - Service action required
Blue   - Safe to remove (hot swap)
White  - Locator beacon

The 6320 rack has a V100 service processor, an integrated patch panel and a SPAT (service processor accessory tray). The V100 has a cdrom and an optional usb flash memory card to save the config. The patch panel consolidates connections for service components and fiber connections. The SPAT has a 4 port terminal concentrator (NTC) with a built-in modem, a firewall/router, an ethernet hub and a future usb power management sequencer. (Customers are encouraged to use remote services by Sun through the provided modem. During initial release of the product, 5/03 through 9/03, install is free.)

FC switches may be mounted in the rack but are no longer monitored or controlled via the SP.

6320 has 3 LANs set up:
Internal LAN - for components only
SP LAN       - remote services net (behind the firewall)
User LAN     - one customer net port

6320 default logins, passwords and roles:
Service Processor   root/!root
Firewall            root/sun1       user firewall access
NTC                 rss/sun1rss     NTC user
NTC                 su/sun1rss      NTC admin
6120 array          root/!root      array admin

GUI passwords (config service):
admin/!admin        full access
storage/!storage    storage set up only
guest/!guest        observe only

Login to the sp from an external system using ssh:
ssh -l root <ip> (the sp does not have a menu to make changes to the config, like the 3900 Indy)

Use the web based GUI:
https://<ip>:9443/se6000ui/login.do (GUI is similar to StorADE)

Use the sscs CLI: (from an external system with packages installed) Commands located in /opt/se6x20/cli/bin

sscs login, sscs list, sscs add, sscs create, sscs modify, sscs delete.

Flash Archive interactive install: (saves time on multi-domain installs) (see info doc 40131)

Create a flash image from a patched server: (load patches and packages before creating image)
# cd /
# flarcreate -S -n image_name /path_to/image_file (~2.2gb - can use -c to compress: takes 2x longer, only 1/5 smaller)
# share -F nfs -o ro,anon=0 /path_to/image_file (share image file) (/etc/init.d/nfs.server start)

Boot new server and load from image: (if booting from CD, best to use the same release as the flash image, ex: sol9 04/04)
(note: you need network connectivity between the image server and the new server to download the image)
- On server to be loaded: boot cdrom or boot net (if you have created an install server or on a 12/15K)
- Answer all install questions until you get to "F2 Standard" "F4 Flash"; select "F4"
- Select NFS
- NFS Location: ip_address:/path_to/image_file (ex: 192.148.220.113:/var/tmp/flash)
- Continue answering install questions as you would on a regular interactive install
- Server will load Solaris from the image you specified/created

Page 93

UltraSPARC III CPU Diagnostic Monitor (CDM): ( see Sun Alert ID: 55081 )

CDM is supported only on UltraSPARC III processor-based platforms running Solaris 8 or Solaris 9. CDM consists of 3 packages with a total size of less than 1MB.

To download packages: http://diagnostics.sfbay/cdm/

EIS-CD 29JUL03 will also have packages on it

Download consists of three Sun packages:

Package      Install order
SUNWcdiam    3
SUNWcdiar    2
SUNWcdiax    1

To start CDM, add the packages and boot the server. It will run at 'default' settings without modifications to /etc/cpudiagd.conf. To change settings, modify /etc/cpudiagd.conf. See the cpudiagd man pages for log files and config info.

To remove CDM:
# /etc/init.d/cpudiag stop
# pkgrm SUNWcdiam SUNWcdiar SUNWcdiax

(note: log files in /var/cpudiag/log/ remain after CDM is removed)

SunFire Service Mode Password Generator: (for info see http://acts.ebay/bulletins/index.cgi?bulletin=159)

Generator url: https://sfservicepass.sfbay/

(Generator will ask for hostid of main SC, ScApp version, RTOS version. If you type 'service' (return, return) in the platform shell the SC will list the needed info.)

To enter service mode, type 'service' and enter the password in the platform shell.
To exit service mode, type 'service'.
ex: setchs -s ok, suspect, faulty -r "reason for status" -c /N0/SB2/p2

V440 : (Chalupa) Solaris 8 7/03 beta

ALOM: ('#.' to enter, default login admin admin1)
poweron          power on server, fru. Turns off ok-2-remove led
poweroff         power off server
removefru        will move a FRU into a state whereby it is ready to be removed
reset            resets the managed system
break            causes the SC to send a break to the managed system OS
bootmode         provides control over the OBP firmware behavior during system initialization
console          connect this user session to the managed system's OS console stream (#. to return)
consolehistory   displays the contents of the selected OS console output buffer
showlogs         displays the contents of the managed system event log
setlocator       cause SC to turn the managed system locator indicator on or off
showlocator      display the managed system locator indicator current state
showenvironment  displays the environmental status to the SC for the managed system
showfru          prints out the FRUID data stored in the FRU PROM
showplatform     displays the hardware configuration of the platform
showsc           displays the details of the SC software configuration and firmware version information

Page 94

shownetwork      displays the current SC network configuration parameters
setsc            allows the user to individually configure SC parameters
setupsc          interactively configures the SC parameters
showdate         displays the current SC date and time
setdate          allows the user to set the current SC date and time
resetsc          resets the SC
flashupdate      download a new firmware image to the active SC
setdefaults      set all the user settable SC configuration parameters to their default value
useradd          add a new user to the SC's user database
userdel          remove an existing user from the SC's user database
usershow         displays the configuration details for a user account, or all accounts (w/o argument)
userpassword     allows an administrator to set/change a user's password
userperm         sets the permissions for the specified user
password         allows a user to change their own login password
showusers        display a list of users currently logged into the SC
logout           logs the current user out from his alom session
help [command]   provides assistance to the user of the CLI by listing the commands

raidctl: Solaris command (V440 hardware raid command, mirror within controller only)

raidctl -h      Help text, no man pages
raidctl -c      Create mirror (note: raid volume will use original disk's ctd#)   ex: raidctl -c c1t1d0 c1t2d0
raidctl -d      Delete mirror                                                     ex: raidctl -d c1t1d0
raidctl [-f]    Update controller firmware                                        ex: raidctl -F image 1
raidctl -l      List raid controller status                                       ex: raidctl -l 1

Adding Locales to Solaris: (S8 see infodoc 44626, S7 infodoc 44505)
There are 3 ways to add locales to a server.

Initial install   select locales while installing
Upgrade           select locales while upgrading
pkgadd            pkgadd from Solaris Media kit Languages CD (about 100meg/locale)
                  (/cdrom/Sol_8_1001_lang_sparc/components/<product>)

Finding Solaris release and distribution loaded:

# more /etc/release (to find the Solaris version loaded)
# more /var/sadm/system/admin/CLUSTER (to find the distribution loaded)

SUNWCXall - Full Distribution + OEM Support
SUNWCall  - Full Distribution
SUNWCprog - Developer
SUNWCuser - End User
SUNWCreq  - Core

Find local NIS servers (see infodoc:4736)

% rpcinfo -b ypserv 2 (systems that respond are running ypserv, and thus are NIS servers)
Are they serving your NIS domain?
% yppoll -h responding_server passwd.byname

Page 95

Network troubleshooting: Commands:

arp -a                            display entries in the arp table
dmesg                             check status of interface at boot time
ifconfig                          allows you to add/modify/delete interface parameters (see page 48,75)
kstat -n interface                kernel stats for interface (good info)
kstat -p                          kstat -p | grep interface gives speed and duplex information
ndd -set /dev/eri instance 0      sets view to eri0
ndd /dev/eri \?                   shows what eri parameters are modifiable
ndd -get /dev/tcp tcp_status      displays tcp parameter value 'tcp_status' (also ndd -get /dev/eri link_status)
netstat -i                        gives you interface details: # of packets, collisions, errors, etc.
netstat -Pn protocol              protocol info, no name resolution
netstat -rnv                      routing info, no name resolution, local view
netstat -k interface              same info as kstat -p but not well formatted
ping 192.168.47.2                 contacts and reports status of 192.168.47.2
rup 192.168.47.2                  contacts and reports up time for 192.168.47.2
route (add, get, flush, delete)   allows you to add, get, delete, flush entries in the routing table
snoop                             monitors network traffic; use -v, -d, interface, ipaddress to filter view
spray 192.168.47.2                will send packets to 192.168.47.2 and report on transfer rate and number received
traceroute 192.168.47.2           maps and times route from your server to 192.168.47.2

Files:

/etc/defaultdomain  - server's domain name
/etc/dhcp.interface - touch file for dhcp boot, ex: /etc/dhcp.hme0 (hme0 will boot dhcp)
/etc/hosts          - list of hosts (local file), is linked to /etc/inet/hosts
/etc/hosts.equiv    - trusted remote hosts and users
/etc/hostname.xxx   - contains interface name and/or config at boot time
/etc/protocols      - contains protocol names configured and pseudo number
/etc/services       - contains services configured and default port number
/etc/notrouter      - touch file if server has multiple interfaces and should NOT route
/etc/defaultrouter  - contains ip address of server's router (needed to reach other subnets)
/etc/gateways       - file contains static route entries
/etc/ftpusers       - contains a list of users that can NOT ftp login (Solaris 8 and 9)
/etc/ftpd/ftpusers  - contains a list of users that can NOT ftp login (Solaris 9)
/etc/netconfig      - network config file
/etc/nsswitch.conf  - contains config of name services on server
/etc/netmasks       - contains a list of base addresses and netmasks
.rhosts             - trusted remote hosts and users

Daemons:

dhcpagent  - implements client half of the DHCP protocol
in.dhcpd   - dhcp daemon; run with the -d -v switch for diagnostic output
in.ftpd    - the Internet FTP server process
in.mpathd  - IPMP process. Started by the 'group' option of the ifconfig command
in.routed  - the routing daemon (only present on router servers) -s -q
in.rdisc   - implements the ICMP router discovery protocol
in.telnetd - a server that supports the TELNET virtual terminal protocol
xntpd      - ntp daemon

Page 96

How to find your way around a B1600... (min O/S Sol8 12/02, Sol9 04/03)
Default login   sc: admin (no password)   sw: admin/admin

SC commands:
console          console connection to switch or blade (use showplatform name; #. to return)
help             lists available commands
showplatform -v  platform and blade config and status information
setupsc          initial sc setup
showsc           lists config data provided to setupsc command
poweroff s#      power off blade number s# (console to blade & shutdown first)
poweron s#       power on blade number s#

SW commands:
help                     lists available commands
?                        "command ?" will list available syntax
show vlan                listing of vlans and the ports assigned to them
show running-config      current switch configuration
show startup-config      config used at boot time
show mac-address-table   mac addresses learned by ports
show system              platform wide config information
show interface           shows status/config of selected interface
show spanning-tree       displays spanning-tree info

Sun Blade Management GUI: http://switch_IP_address:80 (IP address from 'show running-config' command; 'show system' for port address)

switch ports:

NETPn ports are external uplink switch ports. There is no correlation of NETPn port to blade number.
SNPn ports are internal downlink switch ports that are connected to the blades' ce interfaces. There is a 1 to 1 correlation of SNPn port to blade number (ce0 to ssc0/swt, ce1 to ssc1/swt).

Setting up VLANs:

VLANs are assigned to ports and can be designated as tagged or untagged. A tagged VLAN is one that uses tagged communication to a VLAN-aware interface. An untagged VLAN passes all untagged traffic. Ports that have the same VLAN assigned to them can communicate with each other.

The formula for determining a Solaris interface number for a tagged VLAN (VID) is:
1000 * VID + device PPA = VLAN logical PPA

vlan 15 on ce0 : 1000 * 15 + 0 (for ce0) = ce15000
vlan 15 on ce1 : 1000 * 15 + 1 (for ce1) = ce15001
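The PPA formula above is easy to get wrong by a digit when typing ifconfig lines, so a small helper can generate the interface names. This is a minimal sketch with hypothetical function names; the 1 to 4094 VID range is the standard 802.1Q limit, and the instance limit of 999 simply follows from the 1000 multiplier.

```python
def vlan_interface(driver: str, instance: int, vid: int) -> str:
    """Return the Solaris tagged-VLAN interface name, e.g. ce15000,
    using: 1000 * VID + device PPA = VLAN logical PPA."""
    if not 1 <= vid <= 4094:
        raise ValueError("VLAN ID must be in the range 1..4094")
    if not 0 <= instance <= 999:
        raise ValueError("device instance must be in the range 0..999")
    return "%s%d" % (driver, 1000 * vid + instance)


def split_vlan_ppa(ppa: int) -> tuple:
    """Reverse the formula: return (VID, device instance) for a logical PPA."""
    return divmod(ppa, 1000)


if __name__ == "__main__":
    for inst in (0, 1):
        name = vlan_interface("ce", inst, 15)
        print("ifconfig %s plumb" % name)
    print(split_vlan_ppa(15001))
```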

Ex: to assign blade s0 and blade s1 interface ce0 to vlan 15 you would do the following:

on S0 and S1:
# ifconfig ce15000 plumb
# ifconfig ce15000 inet ip_address netmask + broadcast + up
create/add hostname to /etc/hostname.ce15000
add ip_address(es) and hostnames to /etc/hosts

on switch:
Console# config
Console(config)#vlan database
Console(config-vlan)#vlan 15 name VLAN15 media ethernet
Console(config-vlan)#end
Console#config
Console(config)#interface ethernet SNP0 (s0 ce0 is connected to SNP0 port)

(continued on next page)
Page 97

b1600 cont...

Console(config-if)#switchport allowed vlan add 15 tagged
Console(config-if)#end
Console#
Console#config
Console(config)#interface ethernet SNP1 (s1 ce0 is connected to SNP1 port)
Console(config-if)#switchport allowed vlan add 15 tagged
Console(config-if)#end
Console#

(You would follow the same procedure if creating untagged vlans, only the interface would remain ce0 and the switch command would not have 'tagged' at the end. ALSO: if you want the vlan to be seen outside the chassis you must allow it on an external port NETPn.)

Trunking: (ports grouped together to act as one)
To create a static trunk (external ports NETP2 and NETP3 are put into trunk 2): ports must be connected to a static trunk on another switch.

Console#config
Console(config)#interface port-channel 2
Console(config-if)#exit
Console(config)#interface ethernet netp2
Console(config-if)#channel-group 2
Console(config-if)#exit
Console(config)#interface ethernet netp3
Console(config-if)#channel-group 2
Console(config-if)#end
Console#show interface status port-channel 2

To create an LACP (Link Aggregation Control Protocol) trunk: ports must be connected to LACP-enabled trunk ports on another switch.

Console(config)#interface ethernet netp4
Console(config-if)#lacp
Console(config-if)#exit
Console(config)#interface ethernet netp5
Console(config-if)#lacp
Console(config-if)#exit

(The trunk is automatically activated if LACP is enabled on the connected port of the target switch. A trunk formed with another switch using LACP is automatically assigned the next available trunk ID.)

Spanning tree:

Where two bridges are used to connect the same two computer network segments, a spanning tree configuration occurs. Because spanning trees have multiple paths to the same destination, a condition called 'bridge loop' is created. 'Spanning tree protocol' is communication between bridges designed to eliminate the loop path. Use caution if you are configuring the switch for spanning tree protocol, because it will affect switches in the customer's network.

Page 98

b1600 cont...

Full list of commands:

sc commands:
bootmode reset_nvram|diag|skip_diag|normal|bootscript=string sn {sn}
                              Allows you to specify a boot mode for a blade. You need to use it to boot Linux blades for the first time
break -y s#                   Causes blade to drop from Solaris into either kadb or OBP
console -f -r                 Access console of a switch or blade (ssc#/swt, s#). Type #. to return to the sc> prompt
consolehistory -b -e -g       Displays the contents of the switch or blade console's buffer (boot|run ssc#/swt|s#)
flashupdate -s IPaddress -f path -v ssc# s#
                              Enables you to upgrade firmware on a System Controller or a blade
help [command]                Provides help text for specified command
logout
password                      Allows a user to change his or her own password
poweroff -f -y -s -r          Powers off components (ch, ssc#, s#)
poweron -f -y -s -r           Powers on components (ch, ssc#, s#)
removefru -f -y               Powers down components (ch, ssc#, s#)
reset -y -x                   Resets components (s#, ssc#/swt, ssc#/sc, ssc#)
resetsc -y                    Resets the active System Controller
setdate                       Sets the time of day on the System Controller, switches, and server blades
setdefaults -y                Returns the active System Controller (but not its switch) to the factory default settings
setfailover                   Tells you which System Controller is the active and standby System Controller
setlocator on off             Turns on/off blade locator
setupsc                       Enables you to configure the active System Controller interactively
showdate                      Displays the current date and time
showenvironment -v            Displays environmental sensor status in components of the chassis (ssc#, psn, s#)
showfru                       Displays the contents of component(s) FRUID database (ssc#, s#, ch, psn)
showlocator                   Tells you whether the locator LED is on or off
showlogs -b -e -g -v          Displays the events (s#, ssc#)
showplatform -v -p            Displays the status of each component (ssc#, ssc#/swt, psn, s#, ch)
showsc [-v]                   Displays a summary of the configuration of the active System Controller
showusers                     Shows the users currently logged into the System Controller
standbyfru -f -y              Powers down components (ch, ssc#, s#)
useradd username              Adds a named user to the list of permitted System Controller users
userdel username              Deletes a user from the list of permitted System Controller users
userpassword username         Allows a user with a-level permissions to alter another user's password
userperm username aucr        Specifies the named user's permission levels (u gives user administration privileges)
usershow username             Shows details of the specified user's login account

switch commands: (use ? and help commands for assistance)

switch Exec commands:

clear
  counters                   Clears statistics on an interface
  logging                    Clears messages from the logging buffer
  mac-addresstable dynamic   Removes any learned entries from the forwarding database
config                       Activates global configuration mode
copy                         Copies a code image or a switch configuration to or from Flash memory or a TFTP server
  file                       Copy from file system
  running-config             Copy from current system configuration
  startup-config             Copy from startup configuration
  tftp                       Copy from tftp server

Page 99


b1600 cont...

debug                        Debugging functions
delete                       Deletes a file or code image
dir                          Displays a list of files in Flash memory
disable                      Returns to normal mode from privileged mode
exit                         Returns to the previous configuration mode, or exits the CLI
flowcontrol                  Enables flow control on a given interface
garp timer                   Sets the GARP timer for the selected function
help                         Description of the interactive help system
?                            Shows options for command completion (context sensitive)
hostname                     Specifies or modifies the host name for the device
ip dhcp restart              Submits a BOOTP or DHCP client request
login                        Enables password checking at login
password                     Specifies a password on a line
password-thresh              Sets the password intrusion threshold, which limits the number of failed logon attempts
ping                         Sends ICMP echo request packets to another node on the network
port
  monitor                    Configures a mirror session
  security                   Configures a secure port
quit                         Exits a CLI session
reload                       Restarts the system
show
  bridge-ext                 Shows bridge extension configuration
  bridge multicast           Shows the IGMP snooping MAC multicast list
  gvrp configuration         Displays GVRP configuration for selected interface
  garp timer                 Shows the GARP timer for the selected function
  interfaces
    status                   Displays status for the specified interface
      port-channel           Shows information about a particular aggregated link
      vlan                   Displays status for the specified VLAN interface
    counters                 Displays statistics for the specified interface
    switchport               Displays the administrative and operational status of an interface
  ip
    interface                Displays the IP settings for this device
    redirects                Displays the default gateway configured for this device
    filter                   Displays filter rules or captured packets
    igmp snooping            Shows the IGMP snooping configuration
      mrouter                Shows multicast router ports
  line                       Displays a terminal line's parameters
  logging                    Displays the state of logging
  mac-addresstable           Displays entries in the bridge-forwarding database
    aging-time               Shows the aging time for the address table
  map ip
    precedence               Shows the IP precedence map
    dscp                     Shows the IP DSCP map
  port monitor               Shows the configuration for a mirror port
  queue
    bandwidth                Shows round-robin weights assigned to the priority queues
    cos-map                  Shows the class-of-service map
  radius-server              Shows the current RADIUS settings
  running-config             Displays the configuration data currently in use
  snmp                       Displays the status of
  spanning-tree              Shows the spanning tree configuration
  startup-config             Displays the contents of the start up configuration
  system                     Displays system information
  tacacs-server              Shows the current TACACS settings
  users                      Shows all active console and Telnet sessions
  version                    Displays version information for the system

Page 100


b1600 cont...

  vlan                       Shows VLAN information
shutdown                     Disables an interface
silent-time                  Time the management console is inaccessible after unsuccessful logon attempts are exceeded
spanning-tree protocol-migration   Re-checks the appropriate BPDU format
whichboot                    Displays the files booted

switch Configure commands:

authentication login         Defines logon authentication method and precedence
boot system                  Specifies the file or image used to start up the system
bridge-ext gvrp              Enables GVRP globally for the switch
capabilities                 Advertises the capabilities of a given interface for use in auto-negotiation
channel-group                Adds a port to an aggregated link
description                  Adds a description to an interface configuration
enable [level]               Use this command to activate Privileged Exec mode
  password                   Sets a password to control access to the Privileged Exec level
end                          Returns to Privileged Exec mode
exec-timeout                 Sets the interval that the command interpreter waits until user input is detected
exit                         Exit from global configure mode
help                         Description of the interactive help system
hostname                     Specifies or modifies the host name for the device
interface                    Configures an interface type and enters interface configuration mode
  ethernet                   Ethernet IEEE 802.3
  portchannel                Configures an aggregated link and interface configuration mode for the aggregated link
  vlan                       Enters interface configuration mode for a specified VLAN
ip
  filter                     Blocks specified IP packets from entering the internal management port (NETMGT)
  http
    port                     Specifies the port to be used by the Web browser interface
    server                   Allows the switch to be monitored or configured from a browser
  address                    Command to set the IP address for this device
  dhcp
    restart                  Submits a BOOTP or DHCP client request
    client-identifier        Specifies the DHCP client identifier for the switch
  default-gateway            Defines the default gateway
  igmp snooping              Enables IGMP snooping
    vlan static              Adds an interface as a member of a multicast group
    version                  Configures the IGMP version for snooping
    querier                  Allows this device to act as the querier for IGMP snooping
    query-count              Configures the query count
    query-max-responsetime   Configures the report delay
    router-port-expiretime   Configures the query timeout
    vlan mrouter             Adds a multicast router port
jumbo-frame                  Enables support for jumbo frames
lacp                         Configures LACP for the current interface
line                         Identifies a specific line for configuration and starts the line configuration mode
logging
  on                         Controls logging of error messages
  history                    Limits syslog messages saved to switch memory based on severity
mac-address-table
  aging-time                 Sets the aging time of the address table
  static                     Maps a static address to a port in a VLAN
map ip precedence            Enables IP precedence class-of-service mapping
map ip precedence            Maps IP precedence value to a class of service
map ip dscp                  Enables IP DSCP class-of-service mapping
map ip dscp                  Maps IP DSCP value to a class of service

Page 101


b1600 cont...

negotiation                  Enables auto-negotiation of a given interface
no                           Negate a command or set its defaults
queue bandwidth              Assigns round-robin weights to the priority queues
queue cos map                Assigns class-of-service values to the priority queues
radius-server
  host                       Specifies the RADIUS server
  port                       Sets the RADIUS server network port
  key                        Sets the RADIUS encryption key
  retransmit                 Sets the number of retries
  timeout                    Sets the interval between sending authentication requests
snmp-server
  contact                    Sets the system contact string
  location                   Sets the system location string
  host                       Specifies the recipient of an SNMP notification operation
  enable traps               Enables the device to send SNMP traps (SNMP notifications)
spanning-tree                Enables the spanning tree protocol
spanning-tree
  mode                       Configures STP or RSTP mode
  forward-time               Configures the spanning tree bridge forward time
  hello-time                 Configures the spanning tree bridge hello time
  maxage                     Configures the spanning tree bridge maximum age
  priority                   Configures the spanning tree bridge priority
  pathcost method            Configures the path cost method for RSTP
  transmission-limit         Configures the transmission limit for RSTP
  cost                       Configures the spanning tree path cost of an interface
  portpriority               Configures the spanning tree priority of an interface
  edgeport                   Enables fast forwarding for edge ports
  linktype                   Configures the link type for RSTP
speed-duplex                 Configures the speed and duplex operation of a given interface
switchport
  broadcast packetrate       Configures the broadcast storm control threshold
  mode                       Configures VLAN membership mode for an interface
  acceptable-frame-types     Configures frame types to be accepted by an interface
  ingress-filtering          Enables ingress filtering on an interface
  native vlan                Configures the PVID (native VLAN) of an interface
  allowed vlan               Configures the VLANs associated with an interface
  gvrp                       Enables GVRP for an interface
  forbidden vlan             Configures forbidden VLANs for an interface
  priority default           Sets a port priority for incoming untagged frames
tacacs-server
  host                       Specifies the TACACS server
  port                       Sets the TACACS server network port
  key                        Sets the TACACS encryption key
username                     Establish User Name Authentication
vlan database                Enters VLAN database mode to add, change, and delete VLANs
  vlan                       Configures a VLAN, including VID, name and state

Page 102


Cluster 3.x: http://suncluster.eng http://cluster.central (Installation Information)

Introduction: Sun Cluster 3 is the first integrated release of Sun's next generation Full Moon clustering technology. Sun Cluster 3 extends Solaris with the Full Moon cluster framework, enabling the use of core Solaris services such as file systems, devices, and networks seamlessly across a tightly coupled cluster and maintaining full Solaris compatibility for existing applications.

Key Benefits:
Higher / near continuous availability of existing applications based on Solaris services such as highly available file system and network services.
Integrates/extends the benefits of Solaris scalability to dotCOM application architectures by providing scalable and available file and network services for horizontal applications.
Ease of management of the cluster platform by presenting a simple unified management view of shared system resources.

General: The configuration guide is located at suncluster.eng. There is too much information to show it all here. Below are some highlights.

Up to 8 nodes in a cluster, including single node clusters.
Sun and EMC storage supported, with others starting in May 04.
Failover, Scalable and OPS/RAC Services.
Supports Solaris 8 and 9.
PNM is supported for 3.x and IPMP for 3.1 for Public net.
Supports QFE, Gigabit, Wildcat and SCI for Private net.
Supports different types of server nodes in the cluster.
DMM not supported. Have to use STMS or Powerpath, which overrides it.
Terminal concentrator isn't mandatory. Can use RSC or system controllers.

Admin w/s: Admin Workstation not mandatory. Management GUI is now web based. Good to install Sun Console software on Sun machine to have access to double window GUI.

Server Requires end user distribution. However Server Storage and some Software may require more. Best to at least install Full distribution.

Topologies:
Clustered Pair
N+1
Pair + N
N to N scalable
Diskless Cluster
Single-node Cluster

Hardware Notes:
Must change the initiator id on one node if using SCSI arrays between 2 nodes. See Infodoc 20704 for the SCSI initiator change procedure.
When a disk is replaced, the cluster needs to be made aware through the scdidadm command.

Page 103


Cluster 3.x: (cont...) Wiring Diagrams - See the configuration guide on internal site: suncluster.eng.

Commands:

boot -x      Bring server up w/o cluster
ccp          Used to run the cluster control panel software
             # ccp clustername
scstat       Used to get a status of the whole or part of the cluster.
  -D         Shows status for all disk device groups.
  -g         Shows status for all resource groups.
  -i         Shows status for all IP Network Multipathing groups.
  -n         Shows status for all nodes.
  -p         Shows status for all components in the cluster. Use with -v[v] to display more verbose output.
  -q         Shows status for all device quorums and node quorums.
  -v[v]      Shows verbose output.
  -W         Shows status for cluster transport path.
scrgadm      Manages registration and unregistration of resource types, resource groups, and resources
  Show Current Configuration:
    -pv [v] -t resource_type_name -g resource_group_name -j resource_name
  Resource Type Commands: (add, change, remove)
    -a -t resource_type_name -h RT_installed_node_list -f registration_file_path
    -c -t resource_type_name -h RT_installed_node_list
    -r -t resource_type_name
  Resource Group Commands: (add, change, remove)
    -a -g RG_name -h nodelist -y property
    -c -g RG_name -h nodelist -y property -y property
    -r -g RG_name
  Resource Commands: (add, change, remove)
    -a -j resource_name -t resource_type_name -g RG_name -y property -x extension_property
    -c -j resource_name -y property -x extension_property
    -r -j resource_name
  Logical Host Name Resource Commands: (add)
    -a -L -g RG_name -j resource_name -l hostnamelist -n netiflist -y property
  Shared Address Resource Commands: (add)
    -a -S -g RG_name -l hostnamelist -j resource_name -n netiflist -X auxnodelist -y property
scconf       Updates the cluster software configuration. Recommend running scsetup, which prints out the scconf command it used, so you can note and reuse the commands you run repeatedly.
  -pv[v]     Prints out the configuration.
scinstall    Installs Sun Cluster software and initializes new cluster nodes.
  -pv[v]     Prints out packages and versions installed.
scsetup      Interactive cluster configuration tool similar to vxdiskadm in Veritas.
scdidadm     Administers the device identifier (DID) pseudo device driver (did)
  -C         Removes references to nonexistent devices on the cluster nodes.
  -l         Lists the local devices in the DID configuration file.
  -L         Lists all the paths, including those on remote hosts, of the devices in the DID config file.
  -r         Reconfigures the database.
  -R         Performs a repair procedure on a particular device instance.

Page 104


Cluster 3.x: (cont...)

Commands:

scshutdown   Shut down a cluster
scvxinstall  Provides automatic VxVM installation and optional root-disk encapsulation for Sun Cluster nodes.
scgdevs      Global devices namespace administration script
scswitch     Performs ownership and state changes of resource groups and disk device groups in Sun Cluster configurations. Below are some examples:

Misc Procedures:

Device Groups:

Register a new disk group:
scconf -a -D type=vxvm,name=new_disk_group,nodelist=nodex:nodex

Sync device group info after adding a volume:
scconf -c -D name=diskgroup,sync

Getting registered device group information:
scstat -D

Switch a device group off a node:
scswitch -z -D device_group -h node

Switch a device group offline (must be quiescent and unmounted):
scswitch -F -D device_group

Switch a device group into maintenance state (must be quiescent and unmounted):
scswitch -m -D device_group

Switch a device group online:
scswitch -z -D device_group -h node

Resource Groups:

Get current resource group status:
scstat -g

Switch a resource group to another node:
scswitch -z -g resource_group -h node

Switch all resource and device groups off a node:
scswitch -S -h node

Take a resource group offline on all nodes:
scswitch -F -g resource_group

Bring a resource group online on all nodes:
scswitch -Z -g resource_group

View configured resource groups:
scrgadm -p[v][v]

Removing a resource group:
Before a resource group may be removed, all resources within the group must be removed. The steps required are:

1) take the resource group offline
scswitch -F -g resource_group

2) disable the resources within the group
scswitch -n -j name_of_resource

3) remove the resources within the group
scrgadm -r -j name_of_resource

4) remove the resource group
scrgadm -r -g resource_group
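
The four removal steps above can be sketched as a small dry-run helper. This is an illustration only: the function name rg_removal_plan and the group and resource names in the usage line are hypothetical, and the function prints the commands instead of running them, so the sequence can be reviewed before anything is typed on a cluster node.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the commands that remove a
# resource group, in the order given in the steps above.
# Usage: rg_removal_plan RESOURCE_GROUP RESOURCE [RESOURCE ...]
rg_removal_plan() {
    rg=$1
    shift
    echo "scswitch -F -g $rg"            # 1) take the resource group offline
    for res in "$@"; do
        echo "scswitch -n -j $res"       # 2) disable each resource in the group
    done
    for res in "$@"; do
        echo "scrgadm -r -j $res"        # 3) remove each resource in the group
    done
    echo "scrgadm -r -g $rg"             # 4) remove the resource group itself
}
```

For example, rg_removal_plan web-rg web-ip web-app prints six lines, starting with the scswitch -F command and ending with scrgadm -r -g web-rg.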

Page 105


SMS upgrade 1.4.1: (see SMS 1.4.1 install guide http://www.sun.com/servers/highend/sms.html)

Download your SMS packages: http://www.sun.com/servers/highend/sms.html
(make sure to run cksum and compare) (also on EIS CD3 starting Apr-27-04)
- unzip file and note location

Prepare for Upgrade:
- switch user to sms-svc
- Make sure SCs are stable: no data syncs, DR, or hw changes in progress
- Turn off failover on main SC (SC0)      sc0:sms-svc:> setfailover off
- Stop SMS on the spare SC (SC1)          sc1:# /etc/init.d/sms stop
- Backup SMS on spare (optional)          sc1:# smsbackup (can add UFS dest dir. default: /var/tmp)

Upgrade Solaris Operating environment (optional)
sms 1.4.1 will work with sol8 and sol9. There is a different SMS package for each O/S version.
Sol8 02/02
Sol9 04/04
(if you upgrade O/S add all patches and reboot. stop sms again if rebooted)

Upgrade SMS software packages using smsupgrade: (spare SC first, SC1)
- cd to download directory
  sc1:# cd /download_dir/sms_1_4_1_sparc_System_Management_Services_1.4.1/Tools
- smsupgrade
  sc1:# ./smsupgrade /download_dir/sms_1_4_1_sparc_System_Management_Services_1.4.1/Product

Switch control to spare SC (SC1)
- stop SMS on main SC (SC0)                                   sc0:# /etc/init.d/sms stop
- bring down spare (SC1)                                      sc1:# init 0
- boot spare (SC1) to activate pkgs and become main           OK> boot -rv

Update the SC and CPU flash PROMs on the new main SC (SC1)
- switch user to sms-svc
- flash SC:
  sc1:sms-svc:> flashupdate -f /opt/SUNWSMS/firmware/SCOBPimg.di sc1/fp0
  sc1:sms-svc:> flashupdate -f /opt/SUNWSMS/firmware/nSSCPOST.di sc1/fp1    CP1500 only
  sc1:sms-svc:> flashupdate -f /opt/SUNWSMS/firmware/oSSCPOST.di sc1/fp1    SCV2 (cp2140) only
- flash SBs:
  sc1:sms-svc:> flashupdate -f /opt/SUNWSMS/hostobjs/sgcpu.flash sb0 sb1 sb2 etc. (must specify location for sms 1.4.1)
- bring down sc1      sc1:# init 0
- boot sc1            OK> boot -rv

Upgrade the former main SC (SC0)
- Download your SMS packages: www.sun.com/servers/sw (make sure to run cksum and compare)
- unzip file and note location
- stop SMS on the former main (SC0)              sc0:# /etc/init.d/sms stop
- Backup SMS on former main (SC0) (optional)     sc0:# smsbackup (can add UFS dest dir. default: /var/tmp)

Upgrade Solaris (optional)
sms 1.4.1 will work with sol8 and sol9. There is a different SMS package for each O/S version.
Sol8 02/02
Sol9 04/04
(if you upgrade O/S add all patches and reboot. stop sms again if rebooted)

Page 106


smsupgrade 1.4.1: (Cont...)

Upgrade SMS on former main (SC0)
- cd to download directory
  sc0:# cd /download_dir/sms_1_4_1_sparc_System_Management_Services_1.4.1/Tools
- smsupgrade
  sc0:# ./smsupgrade /download_dir/sms_1_4_1_sparc_System_Management_Services_1.4.1/Product

Reboot the former main SC (SC0)
- bring down former main (SC0)                        sc0:# init 0
- boot (SC0) to activate pkgs and become main         OK> boot -rv

Update the SC PROMs on the former main SC (SC0)
- switch user to sms-svc
- flash SC:
  sc0:sms-svc:> flashupdate -f /opt/SUNWSMS/firmware/SCOBPimg.di sc0/fp0
  sc0:sms-svc:> flashupdate -f /opt/SUNWSMS/firmware/nSSCPOST.di sc0/fp1    CP1500 only
  sc0:sms-svc:> flashupdate -f /opt/SUNWSMS/firmware/oSSCPOST.di sc0/fp1    SCV2 (cp2140) only
- bring down sc0      sc0:# init 0
- boot sc0            OK> boot -rv

Verify chassis serial number on main SC (SC1)
- switch user to sms-svc
- check chassis serial #      sc1:sms-svc:> showplatform -p csn
- record serial #             sc1:sms-svc:> setcsn -c serial_numb

Enable failover on main SC (SC1)
sc1:sms-svc:> setfailover on

Solaris 9 SVM (sds) disk replacement: (also see infodoc ID73132)
Beginning with Solaris 9, SVM uses a new feature called Device-ID, which identifies each disk not only by its c#t#d# name but also by a unique ID generated from the disk's WWN or serial number.

Mirrored disk replacement: (use when submirror shows “State: Needs maintenance” in metastat cmd)
On failing disk: (If you can access the disk, if not start at the cfgadm -c unconfigure step)
# umount filesystem                        (unmount any non-svm open filesystems on failed disk)
# metadb -d c1t0d0s7                       (if replicas on this disk, remove them)
# metadb | grep c1t0d0s0                   (verify there are no existing replicas left on the disk)
# cfgadm -c unconfigure c1::dsk/c1t0d0     (might not complete command if busy, remove failed disk)

Insert a new disk:
# cfgadm -c configure c1::dsk/c1t0d0            (configure new disk)
# prtvtoc /dev/rdsk/c0t0d0s2 > /tmp/firstdisk   (get format for new disk)
# fmthard -s /tmp/firstdisk /dev/rdsk/c1t0d0s2  (format disk same as mirror)
# metadevadm -u c1t0d0                          (will update the New DevID)
# metadb -a c1t0d0s7                            (if necessary, recreate any replicas)
# metareplace -e d0 c1t0d0s0                    (do this for each submirror on the disk)
# metastat -i                                   (will change unavailable state of devices to Okay)
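
As a hedged sketch, the "insert a new disk" sequence above can be generated for other disk names with a dry-run helper. The function name svm_mirror_replace_plan and its METADEVICE:SLICE argument format are made up for illustration; it assumes, as in the example above, that replicas live on slice 7 and that the surviving mirror disk supplies the partition table. It only prints commands and does not run them.

```shell
#!/bin/sh
# Dry-run sketch: print the commands used after inserting a replacement
# disk into an SVM mirror.
# Usage: svm_mirror_replace_plan NEW_DISK GOOD_DISK METADEVICE:SLICE [...]
# Assumption: state database replicas are on slice 7 (c1t0d0s7 above).
svm_mirror_replace_plan() {
    new=$1
    good=$2
    shift 2
    ctrl=${new%%t*}                                   # c1t0d0 -> c1
    echo "cfgadm -c configure ${ctrl}::dsk/${new}"    # configure new disk
    echo "prtvtoc /dev/rdsk/${good}s2 > /tmp/firstdisk"
    echo "fmthard -s /tmp/firstdisk /dev/rdsk/${new}s2"
    echo "metadevadm -u ${new}"                       # update the DevID
    echo "metadb -a ${new}s7"                         # recreate replicas
    for pair in "$@"; do
        # pair is METADEVICE:SLICE, e.g. d0:s0
        echo "metareplace -e ${pair%%:*} ${new}${pair#*:}"
    done
    echo "metastat -i"
}
```

Calling svm_mirror_replace_plan c1t0d0 c0t0d0 d0:s0 reproduces the seven commands listed above; extra METADEVICE:SLICE pairs add one metareplace line each.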

Raid-5 disk replacement: (use when raid unit shows “State: Needs maintenance” in metastat cmd)
On failing disk: (If you can access the disk, if not start at the cfgadm -c unconfigure step)
# umount filesystem                        (unmount any open non-svm filesystems on this disk)
# metadb -d c1t0d0s7                       (any replicas on this disk, remove them)
# metadb | grep c1t0d0                     (verify there are no existing replicas left on the disk)
# cfgadm -c unconfigure c1::dsk/c1t0d0     (might not complete command if busy, remove the failed disk)

Page 107


Solaris 9 SVM (sds) disk replacement: (cont...)
Insert a new disk:
# cfgadm -c configure c1::dsk/c1t0d0
Run 'format' or 'prtvtoc' to put the desired partition table on the new disk
# metadevadm -u c1t0d0                     (will update the New DevID)
# metadb -a c1t0d0s7                       (if necessary, recreate any replicas)
# metareplace -e <raid5-md> c1t0d0s0       (do this for each raid on the disk)
# metastat -i                              (will change unavailable state of devices to Okay)

SC rebuild after total disk failure: (Sun Fire 12k/15k)
Use this procedure after disk replacement to rebuild an SC that experienced a total disk failure.
This is a modified version of the `Fresh Installed SCs' portion of the 12k/15k & 20k/25k EIS checklist.
http://sunweb.germany/EIS/Web/inst-support/checkl.html
Note: A smsbackup from the other SC on the platform should not be restored on the failed SC. The smsbackup file must come from the same SC that failed.

Items needed:
Solaris OE CDs (operating system install)
SMS Software (http://www.sun.com/servers/highend/sms.html)
EIS CDs
smsbackup file (from failed SC or ID-PROMs from service call)
explorer output file (from failed SC http://proactive.central)

On Main SC as user sms-svc: setfailover off
On Failed SC at ok prompt: check OBP settings:
setenv auto-boot? false
setenv diag-level pmax-epvmax
setenv input-device ttya
setenv output-device ttya
setenv local-mac-address? true
setenv diag-switch? true
setenv post-on-sir? true
setenv diag-device <same as boot-device>

- Initial SC bootup: boot cdrom.
- Get Solaris install info from explorer output (/etc/nodename, /etc/hosts, /etc/nsswitch.conf, /disks/prtvtoc etc.); you can also reference install docs and customer supplied info.
- Install SC as per EIS "Install Spec". Entire Distribution is required.
- Install Solaris & select manual reboot.
- Fix the "No SOF Interrupt" problem. Append to /a/etc/system: exclude: drv/ohci (Makes booting much faster)
- Touch /a/etc/notrouter (disables routing).
- Reboot SC.
- Log in as user root.
- Insert EIS-CD-ONE. Copy the EIS-CD to the system disc: cd /cdrom/...sun/install; sh copy-cd2sun.sh
- Insert EIS-CD-TWO. Copy the EIS-CD to the system disc: cd /cdrom/...sun2/install; sh add-cd2sun.sh
- Edit /etc/dfs/dfstab to share directory /sun.
- Run setup-standard as user root: cd /sun/install; sh setup-standard.sh
  (Do NOT select option to install SAN Foundation Suite.) (PTS recommends activation of alternate break sequence on SCs)
- Log out & back in to set environment. Or enter: . $HOME/.profile
- Ensure that NIS is not configured. (If NIS/NIS+ used "files" must be first in /etc/nsswitch.conf.)
- Install Solaris patches: Recommended Cluster and Additional Solaris Patches (/sun/patch/<SolarisVn>)
- Solaris 8: Verify entry in /etc/system set TS:ts_sleep_promote=1 (EIS-ALERT#22)
- Fix sendmail messages "My unqualified host name unknown" (/etc/hosts append <hostname>.somewhere.com)
- Reboot

Page 108


SC rebuild after total disk failure: (cont...)

- Install SDS/SVM software.
- Patch the SDS software (Solaris 8 only). /sun/patch/sds/<Vn>
- Install the SMS software on failed SC. (web release: http://www.sun.com/servers/highend/sms.html)
- Patch SMS software /sun/patch/SMS/<Vn>
- As root run smsrestore on failed SC. Use file from smsbackup or install the IDPROM files obtained via the service call.
- Reboot SC.
- Mirror the system disk. See scripts on EIS-CD in /sun/tools/SF15K (SDS Infodoc 28196)
- Set boot-device & diag-device to both sides of the mirror. (SDS: sds-disk, sds-mirror) (See Infodoc 11854)
- If NVRAM editor (nvedit) was used, be sure to setenv use-nvramrc? true
- Set up UFS-ACLs for user sms-svc on SC. As root run script sms-svc-setup.sh (EIS-CD: sun/tools/SF15K)
- As user sms-svc: touch $HOME/.hushlogin
- Append "share cdrom* -o ro,anon=0" to /etc/rmmount.conf
- Share /export/install if not already. (/etc/dfs/dfstab)
- Set up /etc/defaultrouter according to customer requirements.
- Verify connectivity to defaultrouter (eg via ping).
- Execute smsconfig -m on failed SC. Use data from explorer output from failed SC and customer supplied info for reference.
  (if you restored the smsbackup for the failed SC, select 'Edit Network Settings'. All the IP hostnames will be populated and you will only have to supply the IP addresses and save the settings. smsconfig will populate your host, netmasks and hostname files.)
  (if you did not have the smsbackup file, and restored the IDPROM files, you will have to set the platform name and change base ip addresses if necessary. Use explorer output from failed SC and customer supplied info for reference. Also see infodoc ID71490)
- The smsconfig -m command modifies the hosts file. Check it to be sure things are as they should be.
- Verify auto-boot?=true, watchdog-reboot?=false (eeprom auto-boot?, eeprom watchdog-reboot?)
- Shutdown newly loaded SC and do hard reset. (Press reset button on SC).

On MAIN SC as user sms-svc: setfailover on
Wait 5 minutes.....
On MAIN SC as user sms-svc: Verify setfailover (showfailover -v) and showdatasync are "ACTIVE" to propagate changes to spare SC.

- Run explorer and SunCheckup on both SCs, compare outputs and correct any errors.
- When datasync is completed: On Main and spare SC, make a backup copy of sms files (smsbackup)

15K DR examples: (also see Serengeti/15k dr commands page 87, infodoc 76795 How to DR a Single PCI Card)
(cfgadm commands run from domain)

# cfgadm -val                                  (get name “app ID” of board to use with cfgadm -c 'disconnect' or configure command)
# cfgadm -val | grep permanent                 (see what SB has perm memory)
# cfgadm -c disconnect SB0                     (removes SB0)
# cfgadm -c configure SB0                      (adds SB0 back into domain)
# cfgadm -c disconnect IO1                     (removes IO1 and all pci adapters on it)
# cfgadm -c configure IO1                      (configures IO1 back into domain)
# cfgadm -c disconnect pcisch5:e01b1slot0      (removes pci card in IO1 slot 0)
# cfgadm -c disconnect pci_pci0:e00b1slot1     (removes pci card in IO0 slot1)
# cfgadm -c configure pcisch5:e01b1slot0       (configures pci card in IO1 slot 0 into domain)

IO PCI slot #s:
| 3 | 1 |
| 2 | 0 |

15/25K hpost:

sms-svc> hpost -d r -l127       (run hpost on domain R level 127)

.postrc (etc/opt/SUNWSMS/adm/config/platform or A-R)
level 64                        (run level 64)
dash_H_level 127                (run level 127 when DRing a board into domain)
no_ioadapt_ok                   (test SB only. Good when you create a test domain w/o IO)
no_obp_handoff                  (when testing SB only don't attempt to load obp)

Page 109


SMSbackup: (how to manually expand backup file) also see infodoc 77357

- Copy backup file to /tmp
- sms-svc> cpio -icvdum < /tmp/sms_backup.1.4.1.cpio.0

3310/3510 Disk replacement: (also see infodoc 78432 and page 84)

- Save nvram info: system functions, Controller maintenance, Save NVRAM to disks, yes
- Identify bad disk: view and edit scsi device, look for BAD or FAILED status, note Chl, Id and LG_DRV #s, select bad drive, Identify scsi drive, flash all But Selected drive, Flash Drive Time, yes (go find the disk)

disk ID #s (single bus 3310)    disk ID #s (dual bus 3310)    disk ID #s (3510)
Chl 0                           Chl 2      Chl 0              Ch 0 / Ch2
0  3   8  11                    0  3       0  3               0  3  6   9
1  4   9  12                    1  4       1  4               1  4  7  10
2  5  10  13                    2  5       2  5               2  5  8  11

- Physically unseat bad disk, let spin down 20 sec, then remove
- Install replacement disk
- view and edit scsi device, look for NEW_DRV or USED_DRV status. If not seen: select a disk, Scan scsi drive, select Chl (use noted #), select Id# (of replacement), yes
- Is replacement to be new local or global spare? If not, skip to the copy and replace step. If so: view and edit scsi device, select replacement disk, add Global spare drive or add Local spare drive, yes
- If replaced disk cannot be spare: view and edit logical drives, select logical drive, select PREVIOUS spare disk, copy and replace drive, yes (when copy is completed assign PREVIOUS spare back in step above)

How to mount a CD image file (.iso) as a filesystem: (see SRDB 50566)

# lofiadm -a /export/install/sol-10-b72-sparc-v1.iso    (must use absolute path to iso file)
/dev/lofi/1
# mkdir /cd1                                            (create a mount point)
# mount -F hsfs -o ro /dev/lofi/1 /cd1                  (mount /dev/lofi/# on the mount point)
# df -k /cd1
Filesystem    kbytes    used   avail capacity  Mounted on
/dev/lofi/1   239904  239904       0   100%    /cd1

To mount a slice of an .iso image (like s1 when doing a 'setup_install_server')

# mkdir /s1                                             (create the mountpoint)
# dd if=sol-10-b72-sparc-v1.iso of=vtoc bs=512 count=1  (make a copy of the vtoc)
# od -D -j 452 -N 8 < vtoc                              (starting cyl and block length for s1 is 452 bytes into vtoc and is 8 bytes long)
0000000 0000000750 0000857600                           (slice1 starts at cyl 750 and is 857600 blks long)
0000010
# echo 750*640 | bc                                     (starting cyl 750 * blks/cyl, always 640 = s1 starting blk is 480000)
480000
# dd if=sol-10-b72-sparc-v1.iso of=sol-10-b72-sparc-v1-s1.iso bs=512 skip=480000 count=857600
# lofiadm -a /export/install/sol-10-b72-sparc-v1-s1.iso
/dev/lofi/2
# mount -F ufs -o ro /dev/lofi/2 /s1
# df -k /cd1 /s1
Filesystem    kbytes    used   avail capacity  Mounted on
/dev/lofi/1   239904  239904       0   100%    /cd1
/dev/lofi/2   402086  397100       0   100%    /s1
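
The slice arithmetic above (starting cylinder times 640 blocks per cylinder) can be sketched as a small helper. The function names below are hypothetical and the 640 blocks/cyl figure is taken from the note above. One practical detail: od -D prints zero-padded decimals such as 0000000750, which shell arithmetic would read as octal, so the sketch strips the padding first. The dd command is only printed, not run.

```shell
#!/bin/sh
# Sketch: work out where a slice starts inside a Solaris install .iso.
BLOCKS_PER_CYL=640    # blocks per cylinder, per the note above

# od -D output is zero padded (0000000750); remove the padding so the
# shell does not treat the value as an octal number.
strip_zeros() {
    n=$(echo "$1" | sed 's/^0*//')
    echo "${n:-0}"
}

# slice_start_block CYL: starting 512-byte block of the slice
slice_start_block() {
    cyl=$(strip_zeros "$1")
    echo $(( cyl * BLOCKS_PER_CYL ))
}

# slice_dd_command ISO OUT CYL BLOCKS: print the dd command that
# extracts the slice (dry run only, nothing is copied)
slice_dd_command() {
    echo "dd if=$1 of=$2 bs=512 skip=$(slice_start_block "$3") count=$(strip_zeros "$4")"
}
```

With the values from the od output above, slice_start_block 0000000750 gives 480000, matching the bc result in the procedure.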

Page 110


Removing the top cover on a V20z: (very tricky :-)

Keep top button down, pull cover forward until click, slide to the rear.

Explorer -w scextended from cron:

- Add IP address and password (if used) of SC to the /etc/opt/SUNWexplo/scinput.txt file.
- Run crontab -e and add -w default,scextended to the explorer entry
  ex: 0 0 * * 1 /opt/SUNWexplo/bin/explorer -q -e -w default,scextended

Useful COD commands: (to obtain a license: www.sun.com/licensing) 5.14.00 and up, see Infodoc 81531

showcodlicense (-r)
addcodlicense
  sc> addcodlicense 01:80d8a855:000000000:0201010100:c:00000000:BLqg5Ko
deletecodlicense
enablecodboard <sb#>       Used to replace a COD sb (need service passwd on Sun Fires)
showcodusage
showplatform -p cod        (addcodlicense will populate this area)
setupplatform -p cod
showboards

ALOM4v: Niagara (Ontario, Erie) (initial login/password admin/admin1) also see ALOM commands on page 94

Removed in ALOM4v:
Reduced managed system interface:
  Solaris 'scadm', Solaris 'locator', 'prtfru' cannot access DFRUID PROMs, 'prtdiag'/'prtpicl' no environmentals.
  ALOM alerts not forwarded to host syslog.
  ALOM 'setupsc' questions related to managed system interface removed.
  Removed ALOM environment variables: sys_eventlevel, sys_hostname
ALOM cannot detect hung OS:
  Removed ALOM variables: sys_autorestart, sys_xirtimeout, sys_wdttimeout
No CPU Signature (OBP and OS Status) support:
  ALOM 'showplatform' cannot display Booting/OS Running state, stops at running
  sys_bootrestart, sys_bootfailrecovery, sys_maxbootfail, sys_boottimeout

New in ALOM4v:
Password recovery (procedure on page 113)
  If the admin password is lost/forgotten, can reset the NVRAM to factory defaults, including clearing all users.
  Requires physical access to the machine to unplug power cords and connect to ALOM serial port.
Flashupdate protection
  ALOM flash is in two segments with a persistent switch.
  'flashupdate' always operates on the non-running segment.
  Segments are only switched after flashupdate completes and image is CRC verified.
  A jumper can also switch the segments.
  Ex: sc> flashupdate -s 129.148.173.99 -f /tmp/122430-01/System_Firmware-6_1_2-Sun_Fire_T2000.bin-latest
Supports new LED States:
  White locator LED flashes at 4Hz when activated.
  Green LED states:
    Standby blink: 0.1sec on, 2.9sec off. When system is on standby power
    Slow blink: 0.5sec on, 0.5sec off. When system is in transition (running POST, powering down, etc)
    Steady ON: system is running
  Amber LED states:
    Off: No faults.
    On: Service required.
    Amber slow blink to indicate unacknowledged faults not supported.

Page 111


ALOM4v: (cont)

New in ALOM4v:

ALOM handles the fault by:
  Lighting the Fault LED(s)
  Logging the fault to DFRUID of the indicted FRU(s)
  Alerting the user using ALOM alerting mechanisms:
    To logged-in ALOM users
    To an email address (if configured)

New ALOM commands:
  showfaults           Prints any faults: environmental faults, faulty FRUs, POST-detected faults (which result in ASR-disable) and FMA-detected faults. Also prints the time and status of the last POST run.
  clearfault <UUID>    Manually clears an FMA-diagnosed fault (get UUID from showfaults output).

ASR commands:
  showcomponent
  enablecomponent
  disablecomponent
  clearasrdb
  Use these to view and manage the list of blacklisted (ASR-disabled) devices. The disabled state is stored on the actual FRU, such as the DIMM itself. A FRU disabled on one system will remain disabled when inserted in another system.

setkeyswitch
  normal:  System can be used normally.
  stby:    Powers off the system and prevents 'poweron' command or button from operating.
  diag:    Forces the system to run servicemode diagnostics at next reset.
  locked:  Prevents 'flashupdate' and 'break' commands; system can power on/off and reset normally.
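The keyswitch positions above can be summarized as a small lookup. The following Python sketch is for illustration only: the names `BLOCKED` and `command_allowed` are hypothetical helpers, not ALOM interfaces, and the blocked commands are taken directly from the descriptions above.

```python
# Commands that each virtual keyswitch position blocks, per the list above.
# BLOCKED and command_allowed are hypothetical names, not part of ALOM.
BLOCKED = {
    "normal": set(),
    "stby": {"poweron"},
    "diag": set(),
    "locked": {"flashupdate", "break"},
}


def command_allowed(position, command):
    """Return True if `command` is not blocked in the given keyswitch position."""
    if position not in BLOCKED:
        raise ValueError(f"unknown keyswitch position: {position}")
    return command not in BLOCKED[position]


if __name__ == "__main__":
    for pos in ("normal", "stby", "diag", "locked"):
        print(pos, "flashupdate allowed:", command_allowed(pos, "flashupdate"))
```

This only models which commands are refused; it does not model side effects such as stby powering the system off.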

showkeyswitch
showfru              prints both static and dynamic sections
setfru               sets Customer_DataR in all FRUs
showhost version     prints the software versions contained in the Host flash prom
obpupdate            (flash host prom) updates the Host flash prom (POST, OBP, etc). 'obpupdate' and 'flashupdate' will be merged into a single command which will update both ALOM and the Host flash from a single master image.

Servicemode commands:
  setsc sc_servicemode true
  Warning: misuse of this mode may invalidate your warranty.
  Be sure to set sc_servicemode to false when done!
  showplatform -v will print CPU #Cores and version information.

  ping <ipaddress>     test network connectivity
  clearnvramlog        erases persistent 'showlogs -v'
  frucapture           offload a FRU's DFRUID image via FTP
  fruupdate            update (overwrite) a DFRUID image via FTP
  setcsn               set the chassis serial number, required when replacing the PDB board. Can only be executed one time and only with a blank (new) PDB
  fmagentconfupdate    field update FMA agent via FTP
  showfmfaults         show current FMA faults stored on the DOC (Disk-on-chip)
  showfmerptlog1       show the first 40 ereports on DOC
  showfmerptlog2       show the last 40 ereports on DOC
  clearereports        clear the ereport logs from DOC
  docftpput            FTP a DOC file off of ALOM
  Note: the above command names may change by product ship!

Servicemode: spdiag suite
spdiag consists of the following commands:
  i2ctest              run a single pass of the i2c test
  envtest              run a single pass of the environmental test
  sptest               run a single pass of the SP diag tests
  setdiagopt           set diag test options used by 'rundiag'
  rundiag              start diagnostics in the background
  stopdiag             stop any running background diagnostic tests
  showdiagstatus       show the status of background tests
  resetdiagstatus      reset the diagnostic status

Page 112


ALOM4v (cont...)

Diagnostics run environment variables:
  diag_trigger:        when POST runs. Valid triggers: none, power-on-reset, user-reset, error-reset, all-resets
  diag_verbosity:      verbosity level of POST, one of: none, min, normal, max, or debug
  diag_level:          level of testing performed, one of: none, min, or max
  diag_mode:           POST mode, one of: off, normal, service, or menu
  sys_autorunonerror:  controls whether the system should continue to boot if POST finds an error. Set to true or false.
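As a quick aid for checking a planned setting before typing it at the sc> prompt, the allowed values listed above can be put in a table. This is a minimal Python sketch: `VALID` and `check_setting` are hypothetical names, and it assumes each variable takes exactly one of the listed values.

```python
# Allowed values for each diagnostics variable, copied from the list above.
# VALID and check_setting are hypothetical helper names, not ALOM commands.
# Assumption: each variable is set to a single value from its list.
VALID = {
    "diag_trigger": {"none", "power-on-reset", "user-reset", "error-reset", "all-resets"},
    "diag_verbosity": {"none", "min", "normal", "max", "debug"},
    "diag_level": {"none", "min", "max"},
    "diag_mode": {"off", "normal", "service", "menu"},
    "sys_autorunonerror": {"true", "false"},
}


def check_setting(name, value):
    """Return True if `value` is one of the listed values for variable `name`."""
    if name not in VALID:
        raise KeyError(f"not a diagnostics variable from the list: {name}")
    return value in VALID[name]


if __name__ == "__main__":
    print(check_setting("diag_level", "max"))
    print(check_setting("diag_level", "normal"))
```

Note that "normal" is a valid diag_verbosity and diag_mode value but not a valid diag_level, which is an easy mistake to make from memory.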

Forgotten password ALOM4v: Niagara (Ontario, Erie)
1. Connect to the ALOM serial port.
2. Power cycle the server by unplugging both PSU cords and re-plugging.
3. Hit "esc", the Escape key, during ALOM boot at the point:
   Return to Boot Monitor for Handshake
4. After hitting "esc", the ALOM boot escape menu will be printed:

   ALOM <ESC> Menu
   e - Erase ALOM NVRAM.
   m - Run POST Menu.
   R - Reset ALOM.
   r - Return to bootmon.
   Your selection:

5. Enter "e" to erase the ALOM NVRAM and then 'r' to resume ALOM boot.

ALOM will now boot and reset all NVRAM settings. You will automatically be logged on as user 'admin' with no password and no permissions, and all ALOM NVRAM settings will be reset to the factory defaults.

Solaris to Linux cross-reference: (http://www.unixporting.com/quickguide.html and Linux overview for Solaris users 817-3341-10)

Solaris                      Linux                          Description

System Administration Tools
/usr/bin/admintool           /bin/linuxconf                 system administration tasks
/usr/sbin/useradd            /usr/sbin/useradd              adds a new user

Kernel Configuration
/etc/system                  /usr/src/linux

Processes
/usr/bin/ps -ef              /bin/ps -ef                    active processes
/bin/truss                   /usr/bin/strace                trace of the system
/usr/ucb/users               /usr/bin/users                 users currently on the system
/usr/ucb/ps -aux             /bin/ps -aux                   active processes sorted by %cpu
/usr/bin/prstat              /usr/bin/top                   active processes, reports statistics

Physical Memory
/usr/sbin/dmesg | grep mem   grep MemTotal /proc/meminfo    memory size

Hardware Status/Information
/usr/bin/dmesg               /bin/dmesg                     system buffer diagnostic messages
/usr/bin/arch -k             /bin/uname -m                  application architecture of host system

Host ID
/usr/bin/hostid              /usr/bin/hostid                lists host id

Hostname
/usr/bin/hostname            /bin/hostname                  lists hostname
/usr/bin/uname -a            /bin/uname -a                  lists hostname

Swap
/usr/sbin/swap -a            /sbin/swapon -a                add swap space
/usr/sbin/swap -l            /usr/bin/free                  lists swap info
vmstat                       vmstat                         virtual memory statistics

System Files
/etc/vfstab                  /etc/fstab                     filesystem default info
/etc/inet/hosts              /etc/hosts                     network hosts file

Page 113


Solaris to Linux cross-reference: (cont...)

Solaris                      Linux                          Description

The X Window System
/usr/openwin/bin/xterm       /usr/X11R6/bin/xterm           terminal emulator for x windows
/usr/openwin/bin/xhost       /usr/X11R6/bin/xhost           allowed connections to the X server

Networking
/usr/sbin/showmount          /sbin/showmount                clients that remotely mounted a filesystem
/etc/dfs/dfstab              /etc/exports                   sharing resources
/usr/sbin/route              /sbin/route                    manipulate the routing tables
/usr/bin/netstat             /bin/netstat                   show network status
/usr/sbin/ifconfig           /sbin/ifconfig                 configure network interface parameters
/usr/sbin/snoop              /usr/sbin/tcpdump              displays network packets and their contents

Copies
/usr/bin/cpio                /bin/cpio                      copy files
/usr/sbin/tar                /sbin/tar                      copy files

Software
/usr/sbin/pkgadd             /bin/rpm -i[U]vh               add software pkg
/usr/sbin/pkginfo            /bin/rpm -qa                   displays software pkg info
/usr/sbin/pkgrm              /bin/rpm -e                    removes software pkg

Disk Formatting
/usr/sbin/format             /sbin/mke2fs                   creates partition

Disk Partitioning/info
/usr/sbin/format             /sbin/fdisk                    creates partition
/usr/sbin/format             /sbin/fdisk -l                 lists partition info

Disk Space and Information
/usr/sbin/df                 /bin/df                        displays mounted file systems
/usr/sbin/df -k              /bin/df -k                     displays disk space of file systems
/usr/sbin/mount              /bin/mount                     mounts a file system
/usr/bin/du                  /usr/bin/du                    displays disk usage

Log Files
/var/adm/messages            /var/log/messages              system log file

Miscellaneous
/usr/ucb/whoami              /usr/bin/whoami                displays current user name
/usr/bin/fdformat            /usr/bin/fdformat              floppy disk format
/usr/bin/tip                 /usr/bin/minicom               terminal connect thru serial port
/usr/bin/find                /usr/bin/locate                find a file
/usr/bin/who -r              /sbin/runlevel                 displays current run level

SSH - Secure Shell: SSH (Secure Shell/Secure socket shell) is a secure Unix command interface and protocol that gives the user remote access to a device located on a network. SSH is built of three different utilities: slogin, ssh, and scp. These are secure versions of the existing Unix utilities rlogin, rsh and rcp. All SSH commands and sessions are encrypted to enhance security during a remote session. In most cases, if you have to connect via ssh to a server, ICMP (ping) will be disabled. In other words, you will not be able to ping the server.

Commands for ssh users:

ssh hostname                 connect to hostname using ssh
                             ex: # ssh -l root 129.148.173.230
slogin hostname              you can use ssh and slogin interchangeably
ssh hostname command         run command remotely on hostname
ssh -v hostname              connect in verbose mode for debugging
ssh -V                       determine version number for your copy of ssh

Page 114


Commands for ssh users: (cont.)

ssh-keygen                            generate a new public/private key pair
ssh-keygen -c myuserid-ssh2@pha       generate new key pair with identifying comment
sftp hostname                         copy files interactively between hosts (requires SSH2). Commands for an sftp session are similar to standard ftp.
scp filename hostB:filename           copy file from current computer to hostB
scp1 filename hostB:filename          copy file from current computer to hostB (use if hostB only supports SSH1)
scp hostA:filename hostB:filename     copy file between two computers
scp -r hostA:dirname1 hostB:dirname2  copy directory (and its contents) between two computers
scp hostA:fn1 hostB:fn2               copy and rename file between two computers
scp fn1 fn2 fn3 hostB:directoryname   copy multiple files into hostB's directory
ssh-agent command                     run command (usually a shell) under control of ssh-agent
ssh-add                               add local identity to list maintained in memory by ssh-agent
ssh-add filename                      add identity whose private key is stored in filename to list in memory
ssh-add -l                            list keys stored in memory
ssh-add -D                            delete all keys stored in memory
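The scp forms listed above all follow one pattern: optional -r, one or more sources, then a single destination. A minimal Python sketch of that pattern follows; `scp_command` is a hypothetical helper that only builds the argument list and does not run scp.

```python
import shlex


def scp_command(sources, destination, recursive=False):
    """Build an scp argument list: scp [-r] source... destination.

    Hypothetical helper for illustration; it does not execute anything.
    """
    if not sources:
        raise ValueError("at least one source is required")
    argv = ["scp"]
    if recursive:
        argv.append("-r")  # copy a directory and its contents
    argv.extend(sources)
    argv.append(destination)
    return argv


if __name__ == "__main__":
    # Several local files into a directory on hostB.
    print(shlex.join(scp_command(["fn1", "fn2", "fn3"], "hostB:directoryname")))
    # A directory tree between two remote hosts.
    print(shlex.join(scp_command(["hostA:dirname1"], "hostB:dirname2", recursive=True)))
```

Building the command as a list rather than a single string avoids quoting problems if it is later passed to something like subprocess.run.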

Commands for ssh maintainers

ssh-keygen -P /etc/ssh2/hostkey generate & store a new host key

SSH with SMS 1.5
smsinstall command will automatically harden your SC, smsupgrade will not. (Bug ID: 5079760)

To undo hardening: (pg 50, SMS 1.5 Installation Manual)
1. Log in to SC as superuser
2. Type at sc:# prompt: /opt/SUNWjass/bin/jass-execute -u
3. The system will prompt you with an `undo' menu
4. Select the `run' number you want to undo
5. Type q to exit
6. Reboot system

To manually harden a SC with SMS 1.5: (note: telnet, rlogin, ftp and vold will not work afterwards, so make sure you have serial console access before you harden it) infodoc 83763

# /opt/SUNWjass/bin/jass-execute -q -d sunfire_15k_sc-secure.driver

Galaxy ILOM: (default login/password root/changeme)

ILOM (Integrated Lights Out Manager) (Motorola MPC8248 Service Processor):
  Provides RKVMS functionality (Remote Keyboard, Video, Mouse and Storage). Default is not enabled for LAN.
  Provides ability to boot from virtual devices.
  CLI through serial connection or SSH.
  Environmental monitoring (voltage, fan speeds, temperatures, etc.) and will send alert messages.
  Allows for LOM.
  Embedded Web Server w/ SSL encryption (connect to web GUI by: https://ipaddress).
  Flash memory for built-in Linux OS.
  Connects to all components via JTAG connection.
  IPMI v2.0 command interface.
  SNMP v1, v2c and v3 interface.
  CLI, Web GUI or ILOM Remote Console to manage.

To Power on: To turn on main power mode (all components powered on), press and release the small Power button on the server front panel. When main power is applied to the full server, the Power/OK LED next to the Power button lights and remains lit. or

Page 115


Galaxy ILOM: cont...

(Connect a serial cable from the RJ-45 Serial Mgt port on your ILOM SP to laptop)
-> start /SYS

To Power off: press and release the small Power button on the server front panel
or
-> stop /SYS

Configuring the SP: (Serial Port default: 9600/8/1/none)
  cd /SP/network
  set /SP/network pendingipaddress=192.168.0.1
  set /SP/network pendingipnetmask=255.255.255.0
  set /SP/network pendingipgateway=192.168.0.10
  set commitpending=true
  show /SP/network
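When the same sequence has to be typed for several service processors, it can help to generate the lines and sanity-check the addresses first. The Python sketch below reproduces the command sequence shown above; `sp_network_commands` is a hypothetical helper, and it only produces text to paste at the ILOM prompt.

```python
import ipaddress


def sp_network_commands(ip, netmask, gateway):
    """Return the ILOM CLI lines from the sequence above for the given settings.

    Hypothetical helper: it validates dotted-quad syntax only and produces
    text; it does not talk to a service processor.
    """
    for value in (ip, netmask, gateway):
        ipaddress.IPv4Address(value)  # raises ValueError on a malformed address
    return [
        "cd /SP/network",
        f"set /SP/network pendingipaddress={ip}",
        f"set /SP/network pendingipnetmask={netmask}",
        f"set /SP/network pendingipgateway={gateway}",
        "set commitpending=true",
        "show /SP/network",
    ]


if __name__ == "__main__":
    for line in sp_network_commands("192.168.0.1", "255.255.255.0", "192.168.0.10"):
        print(line)
```

The check catches typing slips such as a missing octet, but it does not verify that the gateway is actually reachable from the chosen subnet.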

To start the serial console: (Connect a serial cable from the RJ-45 Serial Mgt port on your ILOM SP to laptop)
-> cd /SP/console
   start
`esc (` to return to SP

eeprom default is screen and keyboard. Use the Solaris eeprom command to get a serial console in Solaris (ssh to host or see remote console below):

eeprom input-device=ttya
eeprom output-device=ttya

BIOS: You need to change the BIOS setting to have serial port control after POST. (This will not override the eeprom setting in Solaris.)
To change the setting: F2 (ctl-E) on reset, Advanced, Remote access Configuration, Redirect after POST [always]
(Some OSs may not work if set to always.)

CLI <verb><options><target><properties>
See Sun Fire X4100 and X4200 Servers System Management Guide for guidance on CLI commands.

VERBS:
  cd         Navigate the object namespace.
  create     Set up an object in the namespace.
  delete     Remove an object from the namespace.
  exit       Terminate a session to the CLI.
  help       Displays help information about commands and targets.
  load       Transfers a file from an indicated source to an indicated target.
  reset      Resets the state of the target.
  set        Sets target properties to the specified value.
  show       Displays information about targets and properties.
  start      Starts the target.
  stop       Stops the target.
  version    Displays the version of service processor firmware running.

Options:       short-cuts
  -default     n/a   Causes the verb to perform only its default functions.
  -destination n/a   Specifies the location of a destination for data.
  -display     -d    Shows the data the user wants to display.
  -examine     -x    Examines the command but does not execute it.
  -force       -f    Causes an immediate shutdown, instead of an orderly shutdown.
  -help        -h    Displays help information.
  -level       -l    Executes the command for the current target and all targets contained through the level specified.
  -output      -o    Specifies the content and form of command output.
  -resetstate  n/a   Resets the state of the target to its default.
  -script      n/a   Skips warnings or prompts normally associated with the command.
  -source      n/a   Indicates the location of a source image.

Page 116
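The short-cut column above is a one-to-one mapping onto the long option names, which can be useful when reading terse command lines in notes or logs. A minimal Python sketch, with `SHORTCUTS` and `expand_options` as hypothetical names and an illustrative command line that is not taken from the handbook:

```python
# Short-cut to long-option mapping, copied from the options table above.
# SHORTCUTS and expand_options are hypothetical names for this sketch.
SHORTCUTS = {
    "-d": "-display",
    "-x": "-examine",
    "-f": "-force",
    "-h": "-help",
    "-l": "-level",
    "-o": "-output",
}


def expand_options(words):
    """Replace any short-cut option in a split command line with its long form."""
    return [SHORTCUTS.get(word, word) for word in words]


if __name__ == "__main__":
    # Illustrative command line only; the level value is made up for the example.
    print(" ".join(expand_options("show -l 2 /SP".split())))
```

Options marked n/a in the table have no short-cut, so they pass through unchanged.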


Galaxy ILOM: cont...

Contents of /SYS and /SP

-> cd SYS
/SYS
-> show
/SYS
  Targets:
    FIOBD
    FT0
    FT1
    MB
    PDB
    PS0
    PS1
    SASBP
  Properties:
    ACT = standby_blink
    FAN_FAULT = off
    LOCATE = off
    POWERSTATE = off
    PSU_FAULT = off
    SERVICE = off
    TEMP_FAULT = off
  Commands:
    cd
    reset
    show
    start
    stop

-> cd ../SP
-> show
/SP
  Targets:
    alert
    cli
    clients
    clock
    console
    logs
    network
    serial
    services
    sessions
    users
  Properties:
  Commands:
    cd
    reset
    show
    version

Web GUI allows you to: (To log on, use https://ipaddress)
  redirect graphical console to remote host.
  connect a virtual floppy or CD-ROM drive.
  monitor and manage fans remotely.
  monitor BIOS messages, OS messages and system status remotely.
  interrogate NICs for MAC remotely.
  power on, off and reset remotely.

Remote Console (RKVMS): (requires Java 5.0 or higher)
  You can use Remote Console to get remote console, keyboard and mouse access to the server and to install s/w from a local CD drive.
  Open a browser to https://SP_ipaddress
  From Remote Console, choose Redirection->Start Redirection->Devices->Mouse/Keyboard/CD-ROM

USERs:
  Can't delete the following accounts: root/anonymous/ldapproxy
  Can create an additional 7 accounts.

Send break: When logged into the SP using ssh with a console session running: ESC + Shift-b

Page 117


Revision History:

First release 01/17/00

Corrections:
02/14/00 page 30 punzip to gunzip
06/21/00 page 19 d = on bd soc+ (was in wrong place)

Additions:
02/28/00 page 39 Uncompressing files
03/14/00 page 40 - 43 T300
03/27/00 page 28 * #TERM=vt100; export TERM
03/27/00 page 44-45 ACT
05/18/00 page 46-48 Advantages of Splitting a Drive into Multiple File Systems
05/19/00 page 48-49 How to configure a system to run on a network
05/19/00 page 49-51 SEVM - How to recover a primary boot disk.
05/23/00 page 51 Disable DMP
06/16/00 page 52 Memory Scrubbing
07/20/00 Page 13 metastat command added to Disk Suite sec.
07/20/00 Page 16 raidutil commands
07/21/00 Page 52 Display remote App GUI locally
08/26/00 Page 53 Cluster 2.x
10/13/00 Page 41 T300 Pgroup secondary disk addressing failover note
10/16/00 Page 31 mpstat command added
10/17/00 Page 40 T300 Pgroup, 2 fiber path data transfer usage note
11/09/00 Page 21 isainfo -v command added
11/09/00 Page 42 T300 tftp boot (examples added)
11/09/00 Page 56 Encapsulating root after using Environmental CD to load O/S:
11/20/00 Page 42 Warning added (:/: sys blocksize (n)k should be set to correct value before 'vol add')
11/20/00 Page 56 Adding a second network interface (without boot)
11/20/00 Page 56 Adding a default gateway
11/27/00 Page 53 OPS general description
12/12/00 Page 57 Volume Manager
12/28/00 Page 60 FTPing to and from sunsolve
02/06/01 Page 56 /etc/name_to_major (cluster warning added)
02/06/01 Page 56 /etc/defaultrouter added
04/24/01 Page 56 /etc/notrouter (warning added to 2nd interface)
04/24/01 Page 61 Serengeti added
04/24/01 Page 30 info on new explorer added
04/24/01 Page 67 Mounting CD without vold
05/20/01 Page 66 Notes added
06/20/01 Page 16 Update A3500 info and rm6 commands
06/20/01 Page 42 modify Enable/Disable command descriptions
06/20/01 Page 43 modify disk and lpc download descriptions
06/20/01 Page 62 modified repeater bd info (removed 3800 4800 warning on dual partitions)
06/20/01 Page 67 mailx: send messages/files
06/20/01 Page 59 take -g out of vxdg import and export example
06/20/01 Page 60 no longer able to create directories on ftp sunsolve
06/20/01 Page 43 Warning added on controller firmware upgrade
06/20/01 Page 61 * when available (added)
07/23/01 Page 28 VTS description change (removing "on-line")
11/06/01 Page 67 T3 forgotten password
11/06/01 Page 67 T3 logging
11/06/01 Page 21 -k added to netstat command


11/06/01 Page 33 dd, added a disk to disk quick copy example
11/08/01 Page 1 note added 'or disk@n for PCI '
12/03/01 Pages 68 -73 StarCat 15k notes
12/12/01 Page 73 local-mac-address
12/12/01 Page 73 SDS- How to mirror root
01/28/02 Page 16 raidutil command switches fixed ( -B and -R)
02/21/02 Page 68 added fin I0771-1 information
02/21/02 Page 68 added SC console port pinout
05/08/02 Page 55 # scconf -N (to change a node ethernet address )
05/08/02 Page 8 (7-127) added (E10K hpost levels)
05/08/02 Page 73 smsconfig -m (added IPMP info)
05/08/02 Page 68,69 smsconfig -m info added
05/08/02 Page 75 IPMP
05/20/02 Page 11 Added SSP3.4 information
06/06/02 Page 76 T3B or T3+ Firmware Rev 2.1 New Functions:
06/10/02 Page 77 Hitachi StorEdge 99X0 Arrays:
06/13/02 Page 78 SunFire forgotten password
06/17/02 Page 64 Sun Fire setfailover, showfailover cmds added
07/16/02 Page 60 Updated 'ftp to sunsolve' with rftp
07/17/02 Page 80 StorEdge Network FC Switch
07/25/02 Page 80 Added to FC switch info
07/29/02 page I added: http://webhome.east/boston/ to disclaimer
08/09/02 Page 31 added 'top' command
10/01/02 page 81 9900v notes added
10/07/02 Page 84 Minnow info added
10/25/02 Page 72 flashupdate-f opt/SUNWSMS/hostobjs/sgcpu.flash
10/28/02 page 86 Tuning ecache scrubber scan rate
10/30/02 Page 86 VxWorks commands (serengeti)
11/04/02 Page 11 Add syntax for share cdrom for VTS
11/08/02 Page 87 LVD adapter information (ultra scsi-3 375-3057)
11/12/02 Page 54 ccdadm command added for ccd.database recovery
11/12/02 Page 87 changed step sequence for booting image
11/20/02 Page 65 added (-x) to domain reset command
11/20/02 Page 10 redlist definition added
11/27/02 Page 87 Replacing a nordica bd in a 15K SC
12/04/02 Page 66 remove firmware bugs add firmware matrix
12/04/02 Page 87 Add serengeti DR commands
12/06/02 Page 85 Added to Minnow info
12/11/02 Page 66 add to firmware matrix
01/14/03 Page 66 modified logging information
01/21/03 Page 64 added `service' and `testinterconnect' commands
02/03/03 Page 88 Clean up non-root disk "controller" numbers
02/10/03 Page 88 Set network parameters at boot:
03/04/03 Page 80 Useful SAN commands
03/04/03 Page 79 Default Storage switch passwords
03/05/03 Page 66 Modified firmware matrix (5.14.4)
03/17/03 Page 59 added /opt/VRTS/bin/vea
03/17/03 Page 88 Starcat Portid cheat sheet
03/17/03 page 62 6800 partition info added
03/21/03 Page 90 StorADE info added
04/11/03 Page 89 Starcat SC: clean the slate
04/11/03 Page 89 Starcat redx info
04/11/03 Page 75 rm + after deprecated under /etc/hostname.qfe0 :
04/18/03 Page 90 get FRU info from serengeti
05/15/03 Page 91 SWAP
06/02/03 Page 92 Maserati Notes- StorEdge 6320 and 6120


06/19/03 Page 11 removed 'slot' from sbus numbering formula
07/08/03 Page 93 Flash Archive interactive install
07/09/03 Page 94 UltraSPARC III CPU Diagnostic Monitor (CDM)
07/10/03 Page 89 add lines to Starcat SC: clean the slate.
07/10/03 Page 94 SunFire Service Mode Password Generator
07/14/03 Page 94 added : To remove CDM
07/21/03 Page 94 V440 ALOM, raidctl
09/03/03 Page 60 update ftp info
10/24/03 Page 94 added setchs -s command to service mode
11/06/03 Page 28 added navigation keys to sunvts
11/06/03 Page 95 Finding Solaris release and distribution loaded
01/20/04 Page 96 Network troubleshooting command, files, daemons
01/26/04 Page 42, 76 volslice note added
01/28/04 Page 72 Added SMS1.4 commands
02/10/04 Page 97 How to find your way around a B1600...
02/12/04 Page 97 added default login and console info
03/01/04 Page 87 added # cfgadm -val | grep permanent
03/09/04 Page 64 updated platform commands 5.16.0
03/09/04 Page 66 updated firmware matrix
04/06/04 Page 72 add SB1 to flashupdate command
04/06/04 Page 87 add 15k to dr command
04/27/04 Page 103 Cluster 3.x
05/31/04 Page 96 added to fileinfo /etc/dhcp.interface
06/11/04 Page 106 added smsupgrade 1.4.1 info
06/28/04 Page 93 flasharch info add (use same release ex: sol9 40/04)
07/07/04 Page 60 suncore password change
07/27/04 Page 106 Solaris 9 SVM (sds) disk replacement
08/19/04 Page 108 SC rebuild after total disk failure
08/27/04 Page 73 simplified sds mirror procedure
08/27/04 Page 106 Made SVM replacement more universal
09/16/04 Page 60 added password url to ftp sunsolve info
09/23/04 Page 109 15K DR / hpost examples
10/11/04 Page 110 smsbackup: manually check a backup file
11/11/04 Page 110 3310/3510 Disk replacement:
12/08/04 Page 64 3800-6800 navigation (ssh) #. added
02/03/05 Page 64 added setchs showchs cmds
02/03/05 Page 110 How to mount a CD image file (.iso) as a filesystem
02/03/05 Page 31 Added iostat (disk thruput test)
04/19/05 Page 111 Removing the top cover on a v20z
04/26/05 Page 111 Explorer -w scextended with cron
08/03/05 Page 111 Useful COD commands:
08/23/05 Page 111 ALOM4v Ontario/Erie (Niagara)
08/23/05 Page 113 Forgotten password (ALOM4v)
09/13/05 Page 68 add details to 15k serial pinout
09/26/05 Page 113 Solaris to Linux cross reference
10/18/05 Page 114 SSH information
10/18/05 Page 115 Galaxy ILOM info
10/28/05 Page 96 kstat -p, netstat -k added
10/28/05 Page 95 Find local NIS servers
12/08/05 Page 109 Made 15k dr clearer (cfgadm -val)
01/09/06 Page 93 added -S to flarcreate example for faster archive
01/17/06 Page 65 updated "remote logging"
03/22/06 page 79 updated serengeti password reset
03/28/06 Page 111 added niagara flashupdate example
04/04/06 Page 116 added x4100 console information
07/24/06 Page 115 SSH with SMS 1.5

