April 7th 2003, Performance Troubleshooting.ppt, © 2003
26th DECUS Symposium
www.decus.de
Using the Right Tools to Troubleshoot
Performance Problems on
Tru64 UNIX®
Jan Mark Holzer, Senior Member of Technical Staff
[email protected] UNIX® Division
agenda
§ concepts
§ tools to troubleshoot
§ resources
Concepts
Concepts
§ follow the 5 P’s rule !!!
Proper Planning Prevents Piss Poor Performance
a good tool selection method?
A: i = random(0, length(ToolList)-1);
   execute ToolList[i];
   if does not work, roll back;
   else commit;
   goto A;
method based on Cary Millsap’s Oracle performance method
selecting the right tools
§ use the right tool for the problem
  – understand the problem
  – isolate area of ‘concern’
§ often the obvious or commonly used tool might not be the right one
§ use lightweight tools to avoid introducing additional overhead
troubleshooting concepts
§ understand Application Characteristics
  – how many processes?
  – how much I/O? What are the patterns (bursts, or spread over time)?
  – how much memory is used?
  – what kind of networking resources are needed? (how many sockets, …)
concepts (cont)
§ determine Performance Requirements
  – number of simultaneous accessors
    § mail, nfs, database, telnet
  – ideal response time
§ test Fulfillment of Requirements
  – benchmark or simulator
concepts (cont)
§ understand capabilities and limitations of target environment
  – read product specs
  – isolate the ‘bottlenecks’ (i.e. limiting factors for performance) in the current configuration
    § DO NOT BELIEVE propaganda material (especially from our friends west of the border ☺)
  – draw/build map of storage subsystem/configuration
    § can use sys_check output to some degree
concepts (cont)
§ read up
  – “System Configuration and Tuning Guide”
  – application specific guide
  – whitepapers
  – other people’s experiences
Tools
http://www.tru64.org/~jmh/
Troubleshooting Guidelines
§ tools for troubleshooting and collecting
  – CSC data collection
    § sys_check (unicensus)
  – performance data collection
    § syd, top, monitor, ps, collect, vmubc, Performance Manager, iostat, vmstat, volstat, advfsstat
  – other tools
    § lsof, netconfig, kdbx scripts, truss, trace
  – kernel/application data collection
    § kprofile, uprofile, iprobe, DCPI, kdbx, cda, om, atom, spike, lockinfo
Reconfiguration (cont)
§ to see the maximum value for a given tunable
  # sysconfig -Q proc maxusers
§ all subsystem attributes are now documented online as man pages
  # man sys_attrs_SubSystemName
§ you don’t have to reboot the system for modified tunable parameters to take effect
§ since all parameters are range checked based on the existing tuning and configuration, the “only” harm that can occur is performance degradation
§ 2 ways to reconfigure tunable parameters:
  – the CDE/dx kerneltuner slide bars
  – the sysconfig command line
    # sysconfig -r <subsystem> <parameter> = <value>
    # sysconfig -r vm vm_page_free_target = 1024
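The range check mentioned above can also be done offline before a value is handed to sysconfig -r. This is a minimal Python sketch, assuming that `sysconfig -Q` prints one line per attribute carrying `min_val=` and `max_val=` fields. The sample text and its numbers are illustrative only, and `parse_limits` and `in_range` are hypothetical helper names.

```python
import re

# Assumed shape of `sysconfig -Q proc maxusers` output; the numbers are
# illustrative and were not taken from a real system.
SAMPLE = """\
proc:
maxusers -      type=INT op=CQ min_val=8 max_val=16384
"""


def parse_limits(text):
    """Return {attribute: (min_val, max_val)} for lines that carry both bounds."""
    limits = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\w+)\s+-\s+.*\bmin_val=(-?\d+)\s+max_val=(-?\d+)", line)
        if m:
            limits[m.group(1)] = (int(m.group(2)), int(m.group(3)))
    return limits


def in_range(limits, attr, value):
    """True if value lies inside the reported bounds for attr."""
    low, high = limits[attr]
    return low <= value <= high


limits = parse_limits(SAMPLE)
print(limits, in_range(limits, "maxusers", 2048))
```

A check like this only mirrors what the kernel already enforces; it is mainly useful when preparing changes for several systems from saved output.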
hardware information
hwmgr
dsfmgr
emxmgr
emxmgr
§ emxmgr
  – Replaced in 5.1B with “hwmgr -view topology”
  – Useful to obtain FC topology information
  – Can look at
    § Adapter mappings
    § FC Port IDs
    § Ports logged in
Hardware Management and multi-pathing information
§ V5.x introduced new device naming and native multi-pathing
§ Automatic configuration and monitoring
§ Today there’s no single utility to view per path/device statistics
  – hwmgr(8) and collect(8) are the best options so far
  – Sometimes emxmgr(8) can be useful too
hwmgr the Swiss Army Knife
§ To view hardware topology
  # hwmgr -view hierarchy
§ To find EMX controllers
  # hwmgr -view hierarchy | grep -E "qbb|emx"
  or in 5.1B
  # hwmgr -view topology
§ Display “stale” path information
  # hwmgr -show scsi -full
  – “in kernel” view
  # hwmgr -get attr current | egrep "dev_base_name|path_state"
hwmgr the Swiss Army Knife
§ Path usage
  # hwmgr -get attr current | egrep "dev_base_name|path_state|path_xfer"
§ “Cross RAD” I/O information
  # hwmgr -get attr current | egrep "dev_base_name|cross_rad|path_xfer"
§ Hardware Database “inconsistencies”
  # hwmgr -show component -inconsistency (-full)
hwmgr -get attr current / Cross RAD I/Os
system information
sys_check
How to run sys_check for Tru64
§ Install kit (part of mandatory OS install V4.0E and later, or via setld(8) for internet kit)
§ Log in as root, choose one of (/usr/sbin in path):
  sys_check > file.html         (standard HTML output)
  sys_check -perf > file.html
  sys_check -all > file.html    (includes security info)
  sys_check -frames             (frames output)
§ Typically runs in 10-30 minutes, HTML file is between 0.5 and 2 MB in size
§ View file with browser
What does sys_check NOT do?
§ Does not:
  – gather every possible system parameter or operating statistic
  – check / warn for every possible configuration or operational problem
  – gather core files, or other binary files needed when investigating system crashes (by default)
  – gather information on every possible layered product (but it does do quite a few)
  – mail or otherwise transmit any data off the system
Some components sys_check gathers
§ Hardware (Boot information, firmware revs)
§ OS Software (setld, Patches)
§ Kernel (VM, configuration parameters)
§ File systems (VFS, AdvFS, UFS, NFS, CDFS)
§ Storage (disks, tapes, HSZs, SWXCR, LSM, Prestoserve, Storage Map)
§ Network (TCP/IP, routing, DECnet, LAT)
§ Pathworks/ASDU, Oracle, SAP, Sybase, Informix, Baan, Samba, Netscape, BMC
§ Clusters (TCR Available, Production Server, V5.0)
§ Printers
Frames sys_check output (no args)
System Monitoring Utilities
Your trial period is over
Please register now
OK
process & system monitoring
syd
monitor
top
vmubc
collect
vmstat
cda/crash
syd, top, monitor
§ display of important system statistics
§ sorts processes in various ways (CPU used)
  – dxproctuner part of the CDE base Sysman suite
§ many options for sorting output
§ signals
§ updates
syd
swapon -s
vmubc V5.1
collect
§ Home grown tool based on various benchmarking experiences ...
  – collects everything you ever wanted to sample
  – collects to stdout or binary file for later analysis
  – PerlTK based GUI (collgui)
  – cfilt script for custom output (Excel spreadsheet etc...)
  – Select subsystems, sample intervals, processes, commands, totals, etc...
  – Looking for feedback/input from YOU
Collect Dataflow Diagram
collect -h
collect -sn
collect -T
collect -sd -T
collect -sd -Ddsk? -T
collect -sd -Ddsk3 -T
collect -sp -n10 -S
collect -sp -P Csetitathome
collect -scp -n7 -S
collect -sc
collect -sd / BusyTime
collgui
§ collgui will then draw both expressions on a graph
§ collgui supports multiple concurrent graphs
vmstat -P
vmstat -P (2)
vmstat -R / -r RAD#
filesystem monitoring
volstat
advfsstat
Example: I/O Statistics for an LSM Volume
# volstat -vpsd
                     OPERATIONS            BLOCKS         AVG TIME(ms)
TYP NAME            READ    WRITE      READ     WRITE     READ   WRITE
dm  dsk1          130395   260827   8382362  16758335     18.8    85.5
dm  dsk2          130650   261303   8387544  16796853     18.0    78.3
dm  dsk3          130653   261128   8396046  16776460     18.1    51.8
dm  dsk4               0    69637         0    905281      0.0     8.1
dm  dsk261        130804   260827   8419760  16758335     20.0    99.8
dm  dsk266        130277   261303   8367152  16796853     18.0    89.8
dm  dsk267        130594   261128   8378784  16776460     18.4    86.1
vol v1            393216   393216  50331648  50331648     22.6   155.6
pl  v1-01         196609   393216  25165952  50331648     22.2    98.8
sd  dsk1-01       130395   260827   8382362  16758335     18.8    85.5
sd  dsk2-01       130650   261303   8387544  16796853     18.0    78.3
sd  dsk3-01       130653   261128   8396046  16776460     18.1    51.8
pl  v1-02         196607   393216  25165696  50331648     23.0   120.0
sd  dsk261-01     130804   260827   8419760  16758335     20.0    99.8
sd  dsk266-01     130277   261303   8367152  16796853     18.0    89.8
sd  dsk267-01     130594   261128   8378784  16776460     18.4    86.1
pl  v1-03              0    69637         0    905281      0.0     8.1
sd  dsk4-01            0    69637         0    905281      0.0     8.1
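Output like the volstat example is easier to compare once it is reduced to a few derived numbers per plex. This is a minimal Python sketch that uses the three plex rows from the example. The field names and the helpers `parse_rows` and `blocks_per_write` are hypothetical, chosen only for illustration.

```python
# Plex rows copied from the volstat example. Columns: type, name, read ops,
# write ops, read blocks, write blocks, avg read ms, avg write ms.
PLEX_ROWS = """\
pl v1-01 196609 393216 25165952 50331648 22.2 98.8
pl v1-02 196607 393216 25165696 50331648 23.0 120.0
pl v1-03 0 69637 0 905281 0.0 8.1
"""


def parse_rows(text):
    """Turn whitespace-separated volstat rows into dictionaries."""
    rows = []
    for line in text.splitlines():
        typ, name, r_ops, w_ops, r_blk, w_blk, r_ms, w_ms = line.split()
        rows.append({
            "type": typ,
            "name": name,
            "read_ops": int(r_ops),
            "write_ops": int(w_ops),
            "read_blocks": int(r_blk),
            "write_blocks": int(w_blk),
            "read_ms": float(r_ms),
            "write_ms": float(w_ms),
        })
    return rows


def blocks_per_write(row):
    """Average transfer size of a write in blocks (0.0 if no writes)."""
    return row["write_blocks"] / row["write_ops"] if row["write_ops"] else 0.0


rows = parse_rows(PLEX_ROWS)
slowest = max(rows, key=lambda r: r["write_ms"])
for row in rows:
    print(row["name"], blocks_per_write(row), row["write_ms"])
print("slowest plex for writes:", slowest["name"])
```

In this sample the two mirrored data plexes see the same write volume but different average write times, which is the kind of asymmetry worth following down to the subdisk rows.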
AdvFS I/O queues in V4.0x
[Diagram. Labels: Synchronous I/O Request, Asynchronous I/O Request, Reads, Metadata, Blocking Queue, Lazy Queue, Wait Queue, Ready Queue, Consol Queue, Device Queue, Disk]
AdvFS I/O queues in V5.x
[Diagram. Labels: Synchronous I/O Request, Asynchronous I/O Request, Reads, Metadata, Sync. Writes (fsync), VM Requests, Blocking Queue, Flush Queue, UBC Request Queue, Lazy Queue, Wait Queue, SmoothSync Queue, Ready Queue, Consol Queue, Device Queue, Disk]
advfsstat
# advfsstat -i 5 -v 3 test_domain
vol1
  rd  wr  rg arg  wg awg blk ubcr flsh wlz sms rlz con dev
 190   0 190 256   0   0  2K    0    0   0   0   0   0   0
 181   0 181 256   0   0  2K    0    0   0   0   0   0   0
 120   0 120 256   0   0  1K    0    0   0   0   0   0   0
vol2
  rd  wr  rg arg  wg awg blk ubcr flsh wlz sms rlz con dev
   0 178   0   0 177 254   0    0   2K  2K  2K   0   0  2K
   0 166   0   0 166 250   0    0   2K  2K  2K   0   0  2K
   0 206   0   0 204 252   0    0   6K  1K  1K   0   0  3K

rd    number of reads on volume
wr    number of writes on volume
rg    number of consolidated reads on volume
arg   average number of blocks per consolidated read on volume
wg    number of consolidated writes on volume
awg   average number of blocks per consolidated write on volume
blk   blocking queue requests on volume
ubcr  UBC queue requests on volume
flsh  flush queue requests on volume
wlz   wait lazy queue requests on volume
sms   smooth sync queue requests
rlz   ready lazy queue requests on volume
con   consol queue requests on volume
dev   device queue requests on volume
advfsstat - reads and writes
§ rd, wr
  – total number of I/O requests handed to the device (consolidated + nonconsolidated)
§ rg, wg
  – number of requests that were consolidated
§ arg, awg
  – average consolidation size (limited by the transfer size of the device)
§ failure to consolidate may mean
  – disk fragmentation
  – random access patterns
  – thread competition
advfsstat – I/O queues (priority order)
§ blocking queue
  – all reads and AdvFS metadata writes
  – high count may mean lots of reads
§ UBC request queue
  – VM flush requests
  – high count may mean memory contention
§ flush queue
  – synchronous writes (fsync)
  – bursts; should not be continually high
§ wait queue
  – holds up metadata (write ahead logging)
  – high values may mean a device problem
advfsstat – I/O queues (priority order)
§ smooth sync queue
  – ages buffers for reuse
  – low counts may mean aging too low, low write activity and vice versa
§ ready queue
  – buffers are sorted here
  – high counts may mean a device problem
§ consol queue
  – prevents starvation
  – behavior difficult to predict
§ device queue
  – size determined by device
  – constant value means I/O bound
lsof
§ List open files
§ List open sockets
§ List processes owning sockets/files
§ Automatically kill processes etc...
§ fuser(8): lightweight tool, part of the base operating system
lsof -i :23, :6000 etc…
lsof /usr
lslk
sched_stat
§ scheduler statistics
  – per CPU info
  – interrupt distribution
  – load balancing
  – documented in Tru64 5.1B
Tru64 V5.1 Distributed Interrupts
cpu |  user  nice system  idle widle | scalls   intr    csw tbsyc
----+--------------------------------+---------------------------
  0 |  53.7   0.0   41.1   0.0   5.1 |  30129  75505 109576  1545
  1 |  76.2   0.0   19.1   0.0   4.8 |  27322      0  46620  1395
  2 |  75.9   0.0   19.3   0.0   4.8 |  27168      0  46117  1557
  3 |  75.4   0.0   19.9   0.0   4.7 |  29494      0  46646  1484
  4 |  56.3   0.0   42.0   0.0   1.7 |  28844  85065 100670  1571
  5 |  76.8   0.0   21.4   0.0   1.9 |  34241      0  38533  1563
  6 |  76.8   0.0   21.6   0.0   1.6 |  34484      0  38261  1573
  7 |  76.7   0.0   21.3   0.0   1.9 |  35580      0  38650  1573
  8 |  55.7   0.0   40.4   0.0   3.9 |  29341  70907 100808  1559
  9 |  75.9   0.0   20.3   0.0   3.8 |  27306      0  39599  1451
 10 |  74.0   0.0   22.2   0.0   3.7 |  35458      0  39223  1067
 11 |  74.3   0.0   21.9   0.0   3.9 |  34119      0  39825   982
 12 |  57.0   0.0   41.8   0.0   1.2 |  36373  91000 101645  1574
 13 |  77.7   0.0   21.1   0.0   1.2 |  49068      0  36888  1572
 14 |  78.0   0.0   20.8   0.0   1.2 |  45700      0  36600  1572
 15 |  77.1   0.0   21.7   0.0   1.2 |  43264      0  37136  1579
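The per-CPU table shows interrupts landing on a subset of the CPUs. This is a minimal Python sketch that picks those CPUs out and compares their system time with the rest, using the cpu, system and intr columns from the table. The tuple layout and the helper `mean` are choices made for this illustration.

```python
# (cpu, system %, interrupts) copied from the per-CPU table.
CPU_STATS = [
    (0, 41.1, 75505), (1, 19.1, 0), (2, 19.3, 0), (3, 19.9, 0),
    (4, 42.0, 85065), (5, 21.4, 0), (6, 21.6, 0), (7, 21.3, 0),
    (8, 40.4, 70907), (9, 20.3, 0), (10, 22.2, 0), (11, 21.9, 0),
    (12, 41.8, 91000), (13, 21.1, 0), (14, 20.8, 0), (15, 21.7, 0),
]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


interrupt_cpus = [cpu for cpu, _, intr in CPU_STATS if intr > 0]
sys_with_intr = mean([s for _, s, intr in CPU_STATS if intr > 0])
sys_without_intr = mean([s for _, s, intr in CPU_STATS if intr == 0])

print("CPUs taking interrupts:", interrupt_cpus)
print(f"mean system %: {sys_with_intr:.1f} with interrupts, "
      f"{sys_without_intr:.1f} without")
```

In this sample every fourth CPU takes interrupts, and those CPUs spend roughly twice as much time in system mode as the others.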
sched_stat -u
sched_stat -u sleep 10
Kernel Profiling Tools
Profiling Tools
§ Used to perform kernel and application level surgery
  – lockinfo command to troubleshoot “hot” locks
  – kprofile to profile kernel
  – (D)CPI to profile kernel and application
lockinfo -sort=misses sleep 20
# lockinfo -sort=misses sleep 20
hostname:   rmcabs05
locktype:   MCS Locks
lockmode:   2 (SMP default)
processors: 32
start time: Sat Dec 9 16:08:27 2000
end time:   Sat Dec 9 16:08:47 2000
command:    sleep 20

  tries  reads  trmax   misses  percent  sleeps  waitmax   waitsum
                                 misses          seconds   seconds
cnode.c_statelock (M)
2464194      0      0  1397731     56.7       0  0.00006  19.98978
processor.runq.lock (M)
1241862      0      0     6971      0.6       0  0.00002   0.04308
BfAccessFreeLock (M)
  16506      0      0      204      1.2       0  0.00001   0.00103
lock.l_lock (M)
2483207      0      0      114      0.0       0  0.00002   0.00113
msgQT.mutex (M)
  11105      0      0       82      0.7       0  0.00001   0.00041
pag.runq.lock (M)
   3950      0      0       60      1.5       0  0.00001   0.00028
kmembuckets.kb_lock (M)
  58781      0      0       29      0.0       0  0.00001   0.00012
vn_free_lock (M)
   6303      0      0       22      0.3       0  0.00001   0.00012
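The "percent misses" column is simply misses divided by tries, which makes it easy to flag hot locks from saved output. This is a minimal Python sketch using the tries and misses values from the lockinfo listing. The 5% threshold and the helper names are assumptions made for illustration, not lockinfo behavior.

```python
# (lock class, tries, misses) copied from the lockinfo listing.
LOCKS = [
    ("cnode.c_statelock", 2464194, 1397731),
    ("processor.runq.lock", 1241862, 6971),
    ("BfAccessFreeLock", 16506, 204),
    ("lock.l_lock", 2483207, 114),
    ("msgQT.mutex", 11105, 82),
    ("pag.runq.lock", 3950, 60),
    ("kmembuckets.kb_lock", 58781, 29),
    ("vn_free_lock", 6303, 22),
]

HOT_THRESHOLD_PCT = 5.0  # assumed cut-off for this sketch


def miss_percent(tries, misses):
    """Percentage of lock attempts that missed (0.0 if never tried)."""
    return 100.0 * misses / tries if tries else 0.0


def hot_locks(locks, threshold=HOT_THRESHOLD_PCT):
    """Names of lock classes whose miss percentage exceeds the threshold."""
    return [name for name, tries, misses in locks
            if miss_percent(tries, misses) > threshold]


for name, tries, misses in LOCKS:
    print(f"{name:24s} {miss_percent(tries, misses):5.1f}%")
print("hot:", hot_locks(LOCKS))
```

Only one lock class stands out in this sample, which is why the next step drills into it with the -class option.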
lockinfo -class=cnode.c_statelock sleep 20
# lockinfo -class=cnode.c_statelock sleep 20
hostname:   rmcabs05
locktype:   MCS Locks
lockmode:   2 (SMP default)
processors: 32
start time: Sat Dec 9 16:16:21 2000
end time:   Sat Dec 9 16:16:41 2000
command:    sleep 20

Locks asserted by PC for lock class: cnode.c_statelock

  count   misses  remote  caller : line #          return : line #
--------------------------------------------------------------------
2119093  1669769  449307  cfsspec_write: 8738      vn_write: 1372
   1053        0     912  cfsspec_read: 8581       vn_read: 1242
    700        0     473  cfspol_touchfile: 392    cfstok_req_one: 1441
    388        0     224  cfs_pfscacheread: 1852   cfs_read: 956
    356        0     192  cfs_read: 938            vn_read: 1242
    262        0     124  cfs_access: 3469         cfs_lookup: 4430
    232        0     102  cfs_inactive: 4037       vrele: 2603
     73        0      53  cfspol_touchfile: 392    cfs_getattr: 2721
     60        0       0  cfs_access: 3570         cfs_lookup: 4430
     52        0      32  cfs_getattr: 2705        vn_stat: 1411
lockinfo sleep 10
lockinfo -class=fsContext_mutex sleep 10
kprofile
§ Enabling kernel profiling
  – Load driver dynamically, must be unloaded BEFORE using DCPI
    # sysconfig -c pfm
§ To start profiling use
    # kprofile -each -k /vmunix
  – to profile individual commands just add the command to the command line
    # kprofile -k /vmunix rm
§ Creates a kmon.out in your current directory
  – Analyze kprofile data using prof(8)
    # prof /vmunix kmon.out
DCPI
§ Continuous Profiling Infrastructure
  – designed for continuous use on production systems
  – intended for programmers and optimization tools
  – based on statistical sampling
§ Advantages
  – transparent, complete, efficient
  – produces accurate fine-grained information
    § execution frequencies
    § stall times & explanations
User Space Daemon
§ User-Space Daemon
  – extracts raw samples from driver
  – associates samples with load-files
  – updates disk-based profiles for load-files
§ Finding Load-Files from <PID, PC>
  – exec hook for statically linked load-files
§ Profiles
  – text header + compact binary samples
Performance of Data Collection
§ Time
  – 1-3% total overhead for most workloads
  – often less than variation from run to run
§ Space
  – 512 KB kernel memory per processor
  – 2-10 MB resident for daemon
  – 10 MB disk after one month of profiling on heavily used timeshared 4-processor machine
§ Non-intrusive enough to be run for many hours on production systems, e.g.
Example: Getting the Big Picture
Total samples for event type cycles = 6095201

 cycles       %    cum%  load file
2257103  37.03%  37.03%  /usr/shlib/X11/lib_dec_ffb_ev5.so
1658462  27.21%  64.24%  /vmunix
 928318  15.23%  79.47%  /usr/shlib/X11/libmi.so
 650299  10.67%  90.14%  /usr/shlib/X11/libos.so

 cycles       %    cum%  procedure              load file
2064143  33.87%  33.87%  ffb8ZeroPolyArc        /usr/shlib/X11/lib_dec_ffb_ev5.so
 517464   8.49%  42.35%  ReadRequestFromClient  /usr/shlib/X11/libos.so
 305072   5.01%  47.36%  miCreateETandAET       /usr/shlib/X11/libmi.so
 271158   4.45%  51.81%  miZeroArcSetup         /usr/shlib/X11/libmi.so
 245450   4.03%  55.84%  bcopy                  /vmunix
 209835   3.44%  59.28%  Dispatch               /usr/shlib/X11/libdix.so
 186413   3.06%  62.34%  ffb8FillPolygon        /usr/shlib/X11/lib_dec_ffb_ev5.so
 170723   2.80%  65.14%  in_checksum            /vmunix
 161326   2.65%  67.78%  miInsertEdgeInET       /usr/shlib/X11/libmi.so
 133768   2.19%  69.98%  miX1Y1X2Y2InRegion     /usr/shlib/X11/libmi.so
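The % and cum% columns follow directly from the sample counts and the total. This is a minimal Python sketch that recomputes them for the four load files in the example. `profile_rows` is a hypothetical helper, not part of DCPI.

```python
TOTAL_CYCLE_SAMPLES = 6095201  # total from the example output

# (load file, cycle samples) copied from the example.
LOAD_FILES = [
    ("/usr/shlib/X11/lib_dec_ffb_ev5.so", 2257103),
    ("/vmunix", 1658462),
    ("/usr/shlib/X11/libmi.so", 928318),
    ("/usr/shlib/X11/libos.so", 650299),
]


def profile_rows(samples, total):
    """Return (name, percent, cumulative percent) rounded to two decimals."""
    rows, running = [], 0
    for name, count in samples:
        running += count
        rows.append((name,
                     round(100.0 * count / total, 2),
                     round(100.0 * running / total, 2)))
    return rows


rows = profile_rows(LOAD_FILES, TOTAL_CYCLE_SAMPLES)
for name, pct, cum in rows:
    print(f"{pct:6.2f}% {cum:6.2f}%  {name}")
```

Reading the cumulative column first shows how few load files account for most of the samples, which is usually where to look next.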
DCPI Install/Scan phase
§ Grab DCPI from public website and install using “INSTALL” script
  http://www.tru64unix.compaq.com/dcpi
§ Can actually create multiple scans for different directories
  # dcpiscan -R /YourLocalDirectoryTree > local.map
  # dcpiscan [-R] $ORACLE_HOME/bin > oracle.map
§ Start data collection (db is the directory you created for the data collection):
  # dcpid -m local.map -m oracle.map db
§ Get a new epoch (this means that the collection is stopped and restarted with a new set of collection data)
  # dcpiepoch
§ Stop the collection:
  # dcpiquit
DCPI - Analyze the data
# dcpiprof -i
Column   Total  Period (for events)
------   -----  ------
cycles  230954  63488
imiss   179284   4096

The numbers given below are the number of samples for each
listed event type or, for the ratio of two event types, the
ratio of the number of samples for the two event types.
===========================================================
cycles       %    cum%   imiss       %  image
224842  97.35%  97.35%  178556  99.59%  /vmunix
  3055   1.32%  98.68%      90   0.05%  /var/subsys/lat.mod
  1884   0.82%  99.49%     455   0.25%  /oracle/8.1.7/bin/oracle
   234   0.10%  99.59%      60   0.03%  /bin/dcpid
   202   0.09%  99.68%       4   0.00%  /oracle/8.1.7/bin/ogms
   176   0.08%  99.76%      15   0.01%  /usr/local/bin/top
   147   0.06%  99.82%      32   0.02%  /usr/shlib/libc.so
DCPI - Analyze the data
# dcpiprof /vmunix
Column   Total  Period (for events)
------   -----  ------
cycles  224842  63488
imiss   178556   4096

The numbers given below are the number of samples for each
listed event type or, for the ratio of two event types, the
ratio of the number of samples for the two event types.
===========================================================
cycles       %    cum%   imiss       %  procedure       image
125138  55.66%  55.66%   67659  37.89%  idle_thread     /vmunix
 51395  22.86%  78.51%  107879  60.42%  gh_zero_memor   /vmunix
 36844  16.39%  94.90%      77   0.04%  vm_page_teste   /vmunix
   794   0.35%  95.25%       7   0.00%  pmap_zero_page  /vmunix
   663   0.29%  95.55%      71   0.04%  table           /vmunix
   660   0.29%  95.84%      10   0.01%  reg_save        /vmunix
   441   0.20%  96.04%       5   0.00%  find_bfap       /vmunix
   426   0.19%  96.23%      32   0.02%  nofault_bcopy   /vmunix
   341   0.15%  96.38%     114   0.06%  syscall         /vmunix
   312   0.14%  96.52%     165   0.09%  hardclock       /vmunix
   268   0.12%  96.64%      70   0.04%  _Xsyscall       /vmunix
DCPI - Analyze the data
# dcpiprof -m /map3.local /oracle/8.1.7/bin/oracle
Column  Total  Period (for events)
------  -----  ------
cycles   1884  63488
imiss     455   4096

The numbers given below are the number of samples for each
listed event type or, for the ratio of two event types, the
ratio of the number of samples for the two event types.
===========================================================
cycles       %    cum%  imiss      %  procedure       image
   244  12.95%  12.95%     38  8.35%  _OtsZero        /oracle/8.1.7/bin/oracle
   136   7.22%  20.17%     14  3.08%  kslwte          /oracle/8.1.7/bin/oracle
   125   6.63%  26.80%     41  9.01%  mux_wait        /oracle/8.1.7/bin/oracle
    97   5.15%  31.95%     27  5.93%  kslgetl         /oracle/8.1.7/bin/oracle
    96   5.10%  37.05%     29  6.37%  ksliwat         /oracle/8.1.7/bin/oracle
    78   4.14%  41.19%     38  8.35%  skgxpwait       /oracle/8.1.7/bin/oracle
    77   4.09%  45.28%     11  2.42%  kclpbi          /oracle/8.1.7/bin/oracle
    76   4.03%  49.31%     12  2.64%  ksbcti          /oracle/8.1.7/bin/oracle
    55   2.92%  52.23%     14  3.08%  skgxpte         /oracle/8.1.7/bin/oracle
    51   2.71%  54.94%      3  0.66%  wire_need_send  /oracle/8.1.7/bin/oracle
    47   2.49%  57.43%     22  4.84%  ksarcv          /oracle/8.1.7/bin/oracle
    46   2.44%  59.87%      6  1.32%  kjatwal         /oracle/8.1.7/bin/oracle
    40   2.12%  62.00%     23  5.05%  kslfre          /oracle/8.1.7/bin/oracle
    40   2.12%  64.12%      9  1.98%  ksimpoll        /oracle/8.1.7/bin/oracle
    37   1.96%  66.08%     15  3.30%  ksbabs          /oracle/8.1.7/bin/oracle
Phoenix
§ Graphical User Interface to DCPI
  – Makes analysis much easier
  – Installed using setld(8)
  – Run using phoenix command
  – Report Generation
  – Selection of events to profile using the GUI
Main Screen
Epoch
Epoch Detailed
Epoch vmunix
Phoenix Example
recommended Tru64 UNIX® settings for ORACLE®
vm
ipc
proc
rt
vfs
advfs
rdg
“vm” subsystem Tru64 UNIX® settings
Virtual Memory Subsystem
– new_wire_method
  § Should be set to ‘0’ on systems where
    § ORACLE8 or ORACLE 8i/9i installed
    § DirectIO used for filesystem I/O
    § SSM memory is in use (ssm_threshold at default)
– ubc_maxpercent
  § Should be set between 35% and 80%
  § Usual rule of thumb would be somewhere between 50% and 75%
  § Setting it too low may cause contention in the UBC
  § Setting it too high might cause memory being “wasted” to UBC
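The two ranges given for ubc_maxpercent can be turned into a quick sanity check. This is a minimal Python sketch; `rate_ubc_maxpercent` and its return strings are hypothetical and only encode the numbers from the slide.

```python
def rate_ubc_maxpercent(value):
    """Classify a ubc_maxpercent value against the ranges given on the slide."""
    if not 0 <= value <= 100:
        raise ValueError("ubc_maxpercent is a percentage")
    if value < 35:
        return "below recommended range (UBC contention possible)"
    if value > 80:
        return "above recommended range (memory may be wasted on UBC)"
    if 50 <= value <= 75:
        return "within rule-of-thumb band"
    return "within recommended range"


for candidate in (20, 40, 60, 100):
    print(candidate, "->", rate_ubc_maxpercent(candidate))
```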
“vm” Subsystem Tru64 Settings
Virtual Memory Subsystem
– vm_ubcseqstartpercent
  § Should be left at default in V5.1 (50%)
  § Most likely being modified in V4.0x stream
  § Algorithm changed in V5.1 to % of ubc_maxpercent
  § In V4.0x stream % of managed pages
– vm_swap_eager
  § Can be set to ‘0’ or ‘1’
  § Database server can potentially be run with ‘lazy’ swap
– rad_gh_regions
  § Should be used on GS80/160/320 systems for optimal performance
  § Used to enable the use of “Large Pages” (4MB)
  § Fewer TB misses
“ipc” Subsystem Tru64 Settings
Inter Process Communication Subsystem
– shm_min
  § Should be left at default value
  § 9i Release Notes might be misleading
  § they say shm_min instead of shm_mni
– shm_max
  § Should be set to 2GB - 8MB (2139095040)
  § V5.1 allows larger value but not proven/tested
– ssm_threshold
  § If rad_gh_regions or gh_chunks in use should be set to ‘0’
  § Enables/disables use of SSM (Segmented Shared Memory)
  § Default is SSM memory enabled (i.e. set to 8MB)
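The shm_max figure is plain arithmetic on binary units, which is easy to verify. This is a minimal Python sketch; the constant names are chosen for this illustration.

```python
GB = 1024 ** 3
MB = 1024 ** 2

# 2GB - 8MB, the shm_max value recommended on the slide.
shm_max = 2 * GB - 8 * MB

# 8MB, the default ssm_threshold mentioned on the slide.
ssm_threshold_default = 8 * MB

print(shm_max, ssm_threshold_default)
```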
“ipc” Subsystem Tru64 Settings
Inter Process Communication Subsystem
– shm_mni
  § Should be set to 256
  § Maximum number of shared memory regions which can be active at any given time
  § With large SGA/shared memory usage you may want to increase it to 1024
– shm_seg
  § Should be set to 128
  § Number of shared memory regions a single process can be attached to
  § For large scale environments you may want to increase it to 256 or 512
“proc” Subsystem Tru64 Settings
Process Subsystem
– (max_)per_proc_stack_size
  § <= 512MB (actual limit is 3GB)
  § Do not set value to >3GB, might cause SGA/stack corruption
– (max_)per_proc_data_size
  § <= available physical memory (~75% of PHYSMEM)
  § Be careful to not overcommit memory
– (max_)per_proc_address_space
  § <= available physical memory (~75% of PHYSMEM)
  § Be careful to not overcommit memory
– As of ORACLE 9i the (max_)per_proc_‘mumble’ parameters should be set to at least the value of pga_aggregate_target
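The data and address space limits are bounded from both sides: no more than roughly 75% of physical memory, and no less than pga_aggregate_target. This is a minimal Python sketch of that window. `per_process_window` is a hypothetical helper, and the 8 GB and 1 GB figures in the usage lines are made-up inputs.

```python
GB = 1024 ** 3


def per_process_window(physmem_bytes, pga_aggregate_target, fraction=0.75):
    """Return (lowest, highest) sensible per-process limit in bytes.

    lowest  = pga_aggregate_target (Oracle 9i rule from the slide)
    highest = fraction of physical memory (about 75% per the slide)
    """
    highest = int(physmem_bytes * fraction)
    if pga_aggregate_target > highest:
        raise ValueError("pga_aggregate_target does not fit in physical memory")
    return pga_aggregate_target, highest


low, high = per_process_window(8 * GB, 1 * GB)
print(f"set the limit between {low} and {high} bytes")
```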
“inet” Subsystem Tru64 Settings
Internet Subsystem
– udp_sendspace
  § Should be set to 65535
  § Important for ORACLE/OPS operations using UDP as the IPC protocol
– udp_recvspace
  § Should be set to 65535
  § Important for ORACLE/OPS operations using UDP as the IPC protocol

http://www.tru64unix.compaq.com/docs/best_practices/BP_INTUNING/TITLE.HTM
http://www.tru64unix.compaq.com/docs/best_practices/BP_GIGABIT/TITLE.HTM
“rt” Subsystem Tru64 Settings
Realtime Subsystem
– aio_task_max_num
  § Should be set to at least 2048
  § Only used if DirectIO/AIO is in use
  § db_block_size is a good starting point
    – or multiples of it (4096, 8192, etc…)
  § Be careful in setting this parameter as you might cause I/O bottlenecks
“vfs” Subsystem Tru64 Settings
Virtual Filesystem Subsystem
– fifo_do_adaptive
  § Should be set to 0
  § Can provide significant performance increase (+20%)
  § Important tuning feature if BEQ protocol is used
“advfs” Subsystem Tru64 Settings
Advanced Filesystem Subsystem
– AdvfsSync_MmapPages
  § Should be set to 0
  § Can provide significant performance increase
  § If read-only mmap()’d files are in use almost a must
  § Avoids premature flushing of mmap()’d pages
“rdg” Subsystem Tru64 Settings
Reliable Datagram Subsystem
– max_objs
  § Should be set to at least 5 x the number of Oracle processes per node
  § At least the larger of 10240 or the number of Oracle processes multiplied by 70
– msg_size
  § Should be set to at least the maximum value of the DB_BLOCK_SIZE parameter for the database
  § Recommended value of 32768 because Oracle 9i supports different block sizes for each tablespace
“rdg” Subsystem Tru64 Settings
Reliable Datagram Subsystem (cont)
– max_async_req
  § Should be set to at least 100
  § Note: A value of 256 might provide better performance
– max_sessions
  § Should be at least the number of Oracle processes plus 2
– rdg_max_auto_msg_wires
  § Must be set to 0
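The rdg rules on the last two slides depend only on the number of Oracle processes and the largest database block size, so they can be computed in one place. This is a minimal Python sketch. `rdg_settings` is a hypothetical helper, and picking 256 for max_async_req follows the slide's note rather than the stated minimum of 100.

```python
def rdg_settings(oracle_processes, max_db_block_size):
    """Suggest rdg subsystem values from the rules on the slides."""
    return {
        # larger of 10240 or 70 x processes (which also covers the 5 x rule)
        "max_objs": max(10240, 70 * oracle_processes),
        # at least the largest DB_BLOCK_SIZE; 32768 is the recommended value
        "msg_size": max(32768, max_db_block_size),
        # minimum is 100; 256 is noted as possibly performing better
        "max_async_req": 256,
        # number of Oracle processes plus 2
        "max_sessions": oracle_processes + 2,
        # must be 0
        "rdg_max_auto_msg_wires": 0,
    }


for name, value in rdg_settings(200, 8192).items():
    print(f"{name} = {value}")
```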
methodologies
Come on! It can’t go wrong every time...
resources (documentation)
§ documentation
  – Tru64 V5.1 Documentation
    § http://www.tru64unix.compaq.com/faqs/publications/pub_page/V51_DOCS/V51_DOCLIST.HTM
  – TruCluster V5.1 Documentation
    § http://www.tru64unix.compaq.com/faqs/publications/cluster_doc/cluster_51/TCR51_DOC.HTM
  – NUMA Overview
    § http://www.tru64unix.compaq.com/faqs/publications/base_doc/DOCUMENTATION/V51_HTML/NUMA/TITLE.HTM
Resources (Documentation)
§ Documentation
  – Tru64/TruCluster Best Practice Documentation
    § http://www.tru64unix.compaq.com/faqs/publications/best_practices/
Online Resources
§ AdvFS Direct IO – Tru64 UNIX 5.1 doc set
  – http://www.tru64unix.compaq.com/faqs/publications/pub_page/doc_list.html
  – Direct IO (Synch): “Programmer's Guide” section 10.2.2
  – Direct IO (Asynch): “Realtime Programmer's Guide” section 7.2
Online Resources (cont)
§ CFS Direct IO – TruCluster Server 5.1
  – http://www.tru64unix.compaq.com/faqs/publications/cluster_doc/cluster_51/TCR51_DOC.HTM
  – Technical Overview section 2.2
Online Resources
§ Tru64 UNIX NUMA Technical Overview
  – http://www.tru64unix.compaq.com/faqs/publications/base_doc/DOCUMENTATION/V51_HTML/NUMA/TITLE.HTM
§ Tru64 UNIX V5.1 Base Operating System Documentation
  – http://www.tru64unix.compaq.com/faqs/publications/pub_page/V51_DOCS/V51_DOCLIST.HTM
§ TruCluster V5.1 Documentation
  – http://www.tru64unix.compaq.com/faqs/publications/cluster_doc/cluster_51/TCR51_DOC.HTM
Online Resources (cont)
§ TruCluster ORACLE Best Practice Documentation
§ Configuring a single instance ORACLE 8i installation on TruCluster V5.1
  – http://www.tru64unix.compaq.com/faqs/publications/best_practices/BP_TCR_ORA_SS/TITLE.HTM
§ Configuring an ORACLE/OPS 8i installation on TruCluster V5.1
  – http://www.tru64unix.compaq.com/faqs/publications/best_practices/BP_TCR_ORA_OPS817/TITLE.HTM
Online Resources (cont)
§ ORACLE 8i Website
  – http://www.oracle.com/ip/deploy/database/8i/index.html?ee.html
§ ORACLE Technet 8i Website
  – http://technet.oracle.com/products/oracle8i/
§ ORACLE Magazine
  – http://www.oramag.com
§ ORACLE Performance Website (like sys_check)
  – http://www.oraperf.com
Resources (Tools)
§ tools repository
  – http://janix.zk3.dec.com/pub/tools/
  – http://www.tru64.org/~jmh/
§ wildfire tools
  – http://janix.zk3.dec.com/pub/tools/wf_tools/
§ Brad Nichols’ tools page
  – http://www.zk3.dec.com/~bnichols/