© 2008 IBM Corporation
AIX Performance Updates
Tools & Tunables
AIX 5.3 TL07, TL08, TL09
AIX 6.1 TL01, TL02
Steve Nasypany
IBM Advanced Technical Support
Agenda
■ SMT POWER5 vs POWER6
■ AIX 5 vs AIX 6
– Tunables Framework
– VMM Tunings
■ AIX 5.3 Tunables Updates
■ Shared Ethernet
■ Dedicated Processor Donation
■ Virtual Shared Pools
■ AIX 5.3 TL-09
– ‘nmon’ in AIX
– Topas VIOS/Adapter/MPIO
– svmon Reports
■ POWER6 p575 & 595
Agenda
■ AIX 6.1 TL01
– Workload Partitions Support
• ps, ipcs, netstat, proc*, trace, vmstat, topas, tprof, filemon, netpmon, pprof, curt
• Separate presentations available to cover WPAR specifics
– Restricted Tunables
– IO pacing
– AIO
– CIO
– NFS biod
– JFS2 nolog
– Multiple Page Size Segments - svmon
– iostat/topas - Filesystem and Workload Partition breakdowns (AIX 6)
■ AIX 6.1 TL02
– topas Memory Pool and Shared Ethernet monitoring
– svmon Reports
– filemon Reports
– mpstat/sar WPAR support
– tprof Large Page and Data profiling
ST vs SMT in Micro partitions
■ Dedicated Processor Partitions switch from simultaneous multi-threaded mode (SMT) to single-threaded mode (ST) automatically at low multi-programming levels
– On POWER5, Micro Partitions do not switch SMT/ST modes automatically
– Micro Partitions may be configured to run in ST mode through the AIX smtctl command
■ On POWER5, long-running single-threaded tasks can see their response time elongated in Micro partitions
– Effects of processor folding
– Effects of the secondary (idle) thread creating some interference for processor core resources
■ POWER6 has a key technical improvement over POWER5 in multi-threading which dramatically reduces this SMT effect in Micro partitions
– On POWER6, Micro partitions do switch SMT/ST modes automatically
– On POWER6, on each cycle the hardware core may dispatch instructions for both hardware threads
ST vs SMT in Micro partitions – POWER6 example
■ Generally, expect perhaps a 1% impact from running in SMT mode in Micro partitions on POWER6
■ Example code from Northwestern University Minebench 1.0
■ The chart shows the ratio of elapsed time for the test running in a Micro partition in SMT mode / ST mode
[Figure: SMT/ST elapsed time ratio for the ScalParc test: 0.994475138 (axis range 0.9 to 1.1)]
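As a worked reading of the ratio above, the short sketch below converts an SMT/ST elapsed-time ratio into a percent change. The function name is ours and only illustrates the arithmetic; a ratio below 1 means the SMT run finished sooner.

```python
def smt_impact_percent(smt_over_st_ratio: float) -> float:
    """Percent change in elapsed time when running in SMT mode instead of ST.

    Negative means the SMT run was faster; positive means it was slower.
    """
    return (smt_over_st_ratio - 1.0) * 100.0


# Ratio reported on the slide for the ScalParc test.
scalparc_ratio = 0.994475138
impact = smt_impact_percent(scalparc_ratio)
print(f"SMT vs ST elapsed time: {impact:+.2f}%")
```

For this test the difference is about half a percent, which is within the "perhaps 1%" range quoted above.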
AIX 5.3 vs AIX 6.1 Framework
■ AIX 6.1 adopts common tunings by default and introduces restricted tunables
– Too many tunables, too much confusion
– It just works
• Don’t change restricted tunables without direction from AIX service stream
• Carefully review software vendor specific recommendations. Often, they are just carrying over old/obsolete tunings from previous OS levels.
• Restricted tunables are not displayed by default except by -o tunable
• Use -F to force view or change
– If you update from AIX 5.3 to AIX 6.1, legacy tunings will be maintained
• This is probably bad for any customer who hasn’t adopted the memory tunings used in the last few years (lru_file_repage=0, etc.)
• Changes will be flagged in lastboot.log and errlog files during reboot
• If you are using a tunable outside of the norm, and are unsure what to do, open a PMR and ask
– New set of SMIT panels to change restricted parameters
• Existing panels only show non-restricted parameters
AIX 5.3 vs AIX 6.1
■ Performance Differences
– You should not see significant deltas between AIX 5.3 and AIX 6.1
– CPU usage should be no more than a couple of percent either way
– Memory footprints may be larger for applications using 64KB pages
• But 64KB page policy is very conservative, specifically to avoid large changes in memory utilization
AIX 5 vs 6 VMM Page Replacement tuning
■ AIX 5.2/5.3
– minperm% = 20
– maxperm% = 80
– maxclient% = 80
– strict_maxperm = 0
– strict_maxclient = 1
– lru_file_repage = 1
– page_steal_method = 0
■ AIX 6.1
– minperm% = 3
– maxperm% = 90
– maxclient% = 90
– strict_maxperm = 0
– strict_maxclient = 1
– lru_file_repage = 0
– page_steal_method = 1
■ The AIX 6.1 tunings are universally recommended for AIX 5.3
– And AIX 5.2, but limiting cache to no more than 24 GB
■ Set-and-forget: lru_file_repage = 0 protects computational memory, always steal from cache
■ No paging to the paging space will occur unless the system memory is overcommitted (AVM > 97%)
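To make the difference between the two sets of values concrete, here is a small sketch that compares them and renders the corresponding vmo invocations as text. Nothing is executed. The helper names are ours, and the choice of -p for dynamic tunables and -r for the reboot-only page_steal_method is an assumption based on common vmo usage; check the vmo documentation for your level before applying anything.

```python
# Values as listed on the slide.
AIX53_VALUES = {
    "minperm%": 20, "maxperm%": 80, "maxclient%": 80,
    "strict_maxperm": 0, "strict_maxclient": 1,
    "lru_file_repage": 1, "page_steal_method": 0,
}
AIX61_VALUES = {
    "minperm%": 3, "maxperm%": 90, "maxclient%": 90,
    "strict_maxperm": 0, "strict_maxclient": 1,
    "lru_file_repage": 0, "page_steal_method": 1,
}

# The slides state that page_steal_method needs a bosboot/reboot.
REBOOT_ONLY = {"page_steal_method"}


def changed_tunables(old, new):
    """Return {name: (old_value, new_value)} for tunables that differ."""
    return {n: (old[n], new[n]) for n in old if old[n] != new[n]}


def vmo_commands(changes):
    """Render illustrative vmo command lines (assumed flags, not executed)."""
    commands = []
    for name, (_, target) in changes.items():
        flag = "-r" if name in REBOOT_ONLY else "-p"
        commands.append(f"vmo {flag} -o {name}={target}")
    return commands


changes = changed_tunables(AIX53_VALUES, AIX61_VALUES)
commands = vmo_commands(changes)
for line in commands:
    print(line)
```

Five of the seven tunables differ; strict_maxperm and strict_maxclient are the same on both levels.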
lru_file_repage=0 Issues
■ But now my system is ~100% memory usage…
– New memory model results in free memory being consumed by cache
– AIX does not actively scrub cache, as it is an expensive overhead
• AIX only looks for memory when it needs it
– Customers do not know how to assess whether additional workloads can be added without causing physical paging
■ There is no trivial method for knowing how much cache is optimal or active for a given workload
– Options on next slide
■ If the system is paging to page space with these settings, you are memory bound
– First, make sure you don’t have a memory leak
– If you have to live with this workload, optimize your paging space
• Add paging spaces, spread them out
• Paging spaces of equal sizes
Minimizing/Optimizing Cache with lru_file_repage=0
■ Simple
– DLPAR memory in as needed when workloads increase and paging occurs
– Script filesystems to unmount/remount after workloads have completed, which will clear them from cache
– Use release-behind mechanisms
• Tells VMM data will not be operated on (no cache benefit)
• read, write and read+write mount options
• You need to know a little bit about your workload's behavior
■ More work
– Decrease maxclient/maxperm or deallocate memory to benchmark workloads
• Baseline the current configuration's vmstat ‘fi’ value
• Reduce by 5%, allowing the system time to adjust
• When the fi value sustains a significant increase, cache is likely constrained
• Raise value 5%. Current computational (vmstat ‘avm’ or svmon ‘virtual’) and non-computational (JFS: numperm, JFS2: numclient) totals should approximate current requirements
– If you have very different workloads, you’ll have to pick which one you want to tune to
■ Difficult
– Use svmon to identify files in cache, monitor I/O & database information
• svmon -jcS lists/sorts client pages and file information
• filemon will give you file activity over short periods
■ Punt
– Adopt Direct I/O or Concurrent I/O
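The "More work" procedure above can be sketched as a simple search loop. This is only an illustration: `measure_fi` is a hypothetical callback standing in for "apply the limit, let the system settle, read the sustained vmstat fi value", the 5% step is treated as five percentage points, and the 1.5x threshold for a "significant increase" is an arbitrary assumption.

```python
from typing import Callable


def find_cache_limit(start_pct: int,
                     measure_fi: Callable[[int], float],
                     step: int = 5,
                     rise_factor: float = 1.5,
                     floor_pct: int = 5) -> int:
    """Lower the cache limit step by step until 'fi' rises, then back off.

    Returns the last limit at which fi stayed close to the baseline.
    """
    baseline = measure_fi(start_pct)
    limit = start_pct
    while limit - step >= floor_pct:
        candidate = limit - step
        if measure_fi(candidate) > baseline * rise_factor:
            # Cache is likely constrained at 'candidate'; keep the value
            # one step above it (the "raise value 5%" step on the slide).
            return limit
        limit = candidate
    return limit


def fake_fi(limit_pct: int) -> float:
    """Hypothetical workload: file page-ins jump once the limit drops below 60."""
    return 100.0 if limit_pct >= 60 else 400.0


chosen = find_cache_limit(90, fake_fi)
print(f"suggested limit: {chosen}%")
```

With the made-up workload above, the loop steps from 90 down to 55, sees fi jump, and settles on 60.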
List-based LRU page_steal_method=1
■ Partition memory is broken up into page pools
– A page pool is a set of physical pages of the same size that form a list
– One lrud per memory pool
■ When the free list is depleted, lrud scans the list for the type of pages VMM desires (in buckets of 128K pages)
■ Default page_steal_method = 0
– Working storage and file pages mixed in one list
– lrud scans sequentially to find pages of the right type
■ List-based page_steal_method = 1
– There are two lists for a page pool, one for working storage and another for file pages
■ lru_file_repage affects which pages are stolen
– If lru_file_repage = 0, then it will steal from the file list. The higher the computational footprint, the better the scanning efficiency will be.
– If lru_file_repage = 1, then legacy repaging counters/logic will determine which list is used
■ List-based reduces CPU time due to less scanning
■ This is NOT a dynamic tunable
– Requires a bosboot/reboot to take effect
– Is the AIX 6.1 default
[Figure: Page Pool with page_steal_method = 1. A list of w/s pages with its own page scan for w/s, and a list of file pages with its own page scan for file.]
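The scanning benefit of separate lists can be illustrated with a toy model. This is not how lrud is implemented: it ignores reference bits, buckets and repaging logic, and simply counts how many list entries must be examined to find a given number of file pages in a pool dominated by working storage. All names and sizes are made up for the illustration.

```python
import random


def pages_scanned_mixed(pool, wanted_type, needed):
    """Walk one mixed list until 'needed' pages of wanted_type are found."""
    scanned = 0
    found = 0
    for page_type in pool:
        scanned += 1
        if page_type == wanted_type:
            found += 1
            if found == needed:
                break
    return scanned


def pages_scanned_split(pool, wanted_type, needed):
    """With one list per page type, only the wanted list is walked."""
    wanted_list = [p for p in pool if p == wanted_type]
    return min(needed, len(wanted_list))


rng = random.Random(0)
# Hypothetical pool: 90% working storage ("w"), 10% file pages ("f").
pool = ["w"] * 9000 + ["f"] * 1000
rng.shuffle(pool)

mixed = pages_scanned_mixed(pool, "f", 100)
split = pages_scanned_split(pool, "f", 100)
print(f"mixed list: {mixed} pages examined, split lists: {split}")
```

The larger the working-storage share, the more entries the mixed-list scan has to skip, which is the effect described above for a high computational footprint.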
New Tunables
■ psm_timeout_interval = 5000
– Determines the timeout interval, in milliseconds, to wait for page size management daemons to make forward progress before LRU page replacement is started. This setting is only valid on the 64-bit kernel. Default: 5000 (5 seconds). Possible values: 0 through 60,000 (1 minute). When page size management is working to increase the number of page frames of a particular page size, LRU page replacement is delayed for that page size for up to this amount of time. On a heavily loaded system, increasing this tunable can give the page size management daemons more time to create more page frames before LRU runs.
– Basically, 64 KB page migrations can cause a deadlock between lrud and psmd
– vmo tunable
New Tunables
■ JFS2 Sync Tunables (TL08)
– The file system sync operation can be problematic in situations where there is very heavy random I/O activity to a large file. When a sync occurs, all reads and writes from user programs to the file are blocked. With a large number of dirty pages in the file, the time required to complete the writes to disk can be large. New JFS2 tunables are provided to relieve that situation.
New Tunables
– j2_syncPageCount
Limits the number of modified pages that are scheduled to be written by sync in one pass for a file. When this tunable is set, the file system will write the specified number of pages without blocking I/O to the rest of the file. The sync call will iterate on the write operation until all modified pages have been written.
Default: 0 (off), Range: 0-65536, Type: Dynamic, Unit: 4KB pages
– j2_syncPageLimit
Overrides j2_syncPageCount when a threshold is reached. This is to guarantee that sync will eventually complete for a given file. Not applied if j2_syncPageCount is off.
Default: 16, Range: 1-65536, Type: Dynamic, Unit: Numeric
– If application response times are impacted by syncd, try j2_syncPageCount settings from 256 to 1024. Smaller values improve short term response times, but still result in larger syncs that impact response times over larger intervals.
– These will likely require a lot of experimentation, and detailed analysis of I/O
– Does not apply to mmap() or shmat() memory files.
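To get a feel for the suggested 256 to 1024 range, the sketch below estimates how many write passes a sync needs for one file at a given j2_syncPageCount. It is a simplification under stated assumptions: the helper name is ours, the j2_syncPageLimit override is deliberately not modeled, and a value of 0 is treated as a single blocking pass.

```python
import math

PAGE_SIZE = 4096  # j2_syncPageCount is expressed in 4KB pages


def sync_passes(dirty_bytes: int, sync_page_count: int) -> int:
    """Estimate the number of write passes sync needs for one file.

    sync_page_count == 0 means the tunable is off: one pass writes everything.
    """
    dirty_pages = math.ceil(dirty_bytes / PAGE_SIZE)
    if dirty_pages == 0:
        return 0
    if sync_page_count == 0:
        return 1
    return math.ceil(dirty_pages / sync_page_count)


one_gib = 1 << 30
passes = {count: sync_passes(one_gib, count) for count in (0, 256, 1024)}
for count, n in passes.items():
    print(f"j2_syncPageCount={count}: {n} pass(es) for 1 GiB of dirty data")
```

Smaller counts mean more, shorter passes, which matches the trade-off described above between short-term response time and total sync duration.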
New Tunables
■ proc_disk_stats (TL08)
– There is a single process-wide structure that is updated for each I/O
– Structure is protected by a single lock: pv_lock_d
– The more threads doing high I/O, the higher the potential for lock contention
• Should be easily visible by using the splat lock tool
• Default behavior not changed. Turn off when process scope disk statistics are not required
• Encountered in DB2 TPC-C benchmark tests
– schedo tunable
– APAR IZ12059
New Tunables
■ large_receive (TL08)
– Shared Ethernet
– The 10 Gig adapter's LRO ("large receive offload") feature is enabled by default, and this may cause problems for a system configuration where a Shared Ethernet Adapter is bridging traffic for Linux LPARs (which cannot receive packets larger than their MTU).
– SEA will provide its own "large_receive" attribute, defaulted to "no", which will disable the feature in the underlying real adapter to avoid such problems out of the box. The user has the choice to override this and set the SEA's attribute to "yes" to enable the large receive feature in the underlying device (if available), overriding the device's own large_receive attribute setting
– SEA large_receive setting is dynamic as long as the adapter large_receive was enabled at boot. Otherwise adapter has to be recycled to support SEA change.
Shared Ethernet vs HEA on 10Gb
■ SEA has architectural limits with 10Gb adapters
– POWER5 limited by RIO-G/drawer bandwidth (~3 Gb/s)
– POWER6 (1500 MTU)
• Send: large_send off 3 Gb/s, large_send on 8 Gb/s
• Receive: large_receive off 3 Gb/s, large_receive on ? Gb/s (no benchmark data available yet)
– No issues with 1Gb performance, just 10Gb
– large_receive setting should allow SEA to be more competitive with HEA, but HEA is expected to be higher performance
■ Always use large_send, regardless of MTU size
– HEA will buffer and break up packets automatically
■ Use 266 MHz slots for 10Gb adapters where possible in heavy traffic environments
■ Any VIOS entitlements must be increased
– Need at least 2-3 CPUs to max out a 10Gb card
■ Memory cost is ~150MB per LHEA port
■ There are APARs in work for network dog-thread optimization issues (would impact customers with small packet sizes and packet counts in the 100K+/sec range). Expected in Q1/2009.
Shared Ethernet Tools
■ seastat
– Shared Ethernet statistics, shipped in AIX 5.3 TL08
– Not Nigel’s tool
– CLI script in VIOS 1.5.2.1 executes command
– Device must be enabled for accounting statistics
■ nmon 12 supports SEA reports
seastat
$ seastat -?
Usage: seastat -d <device name> -c
seastat -d <device name> [-n | -s searchtype=value]
$ chdev -dev ent8 -attr accounting=enabled
ent8 changed
$ seastat -d ent8
=============================================================================
Advanced Statistics for SEA
Device Name: ent8
=============================================================================
MAC: A6:3C:00:09:33:04
----------------------
VLAN: None
VLAN Priority: None
Hostname: js22aix.aixncc.uk.ibm.com
IP: 9.69.44.177
Transmit Statistics:        Receive Statistics:
--------------------        -------------------
Packets: 8                  Packets: 18
Bytes: 646                  Bytes: 1103
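If the seastat report is collected by a script, the two-column counters can be pulled out with a small parser. The sketch below works on a shortened copy of the sample output above; the function name and the tx_/rx_ key names are ours, and the real output may contain more sections than shown here.

```python
import re

# Shortened copy of the sample seastat output shown above.
SAMPLE = """\
MAC: A6:3C:00:09:33:04
Hostname: js22aix.aixncc.uk.ibm.com
IP: 9.69.44.177
Transmit Statistics: Receive Statistics:
Packets: 8 Packets: 18
Bytes: 646 Bytes: 1103
"""

# A counter line carries the transmit value first, then the receive value.
COUNTER_LINE = re.compile(
    r"\s*(Packets|Bytes):\s+(\d+)\s+(Packets|Bytes):\s+(\d+)\s*$"
)


def parse_seastat(text):
    """Return transmit/receive packet and byte counters as a dict."""
    stats = {}
    for line in text.splitlines():
        match = COUNTER_LINE.match(line)
        if match and match.group(1) == match.group(3):
            key = match.group(1).lower()
            stats["tx_" + key] = int(match.group(2))
            stats["rx_" + key] = int(match.group(4))
    return stats


stats = parse_seastat(SAMPLE)
print(stats)
```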
New mount option - noatime
■ Ingo Molnar (Linux kernel developer) said:
– "It's also perhaps the most stupid Unix design idea of all times. Unix is really nice and well done, but think about this a bit: 'For every file that is read from the disk, lets do a ... write to the disk! And, for every file that is already cached and which we read from the cache ... do a write to the disk!'"
■ If you have a lot of file activity, you have to update a lot of timestamps
– File timestamps
• Inode change time (ctime)
• File last modified time (mtime)
• File last access time (atime)
– New mount option noatime disables last access time updates for JFS2
– File systems with heavy inode access activity due to file opens can have significant performance improvements
■ APARs
– IZ11282 AIX 5.3
– IZ13085 AIX 6.1
Dedicated Processor Donation (TL06 & POWER6)
■ The ability of dedicated processor partitions to give unused compute cycles to the shared processor pool
■ Using this feature has the effect of making the capacity of the shared pool variable
■ Partitions configured in this way only donate cycles to the shared pool when physical processors in the partition are idle
– If the partition becomes > 80% busy under AIX, the partition ceases to donate cycles to the shared pool
– Any I/O interrupt will result in the dedicated processor partition being redispatched if it had donated capacity
– There is a guarantee not to get phantom interrupts (interrupts for other partitions)
– The partition keeps running on the same physical processors
– Must be enabled on HMC
■ New phyp instrumentation collects
– donated cycles
• voluntarily donated by an idle dedicated partition to shared pool
– stolen cycles
• cycles stolen by phyp from a dedicated partition to run maintenance tasks (hypervisor)
• can happen whether donation is enabled or not (just wasn’t instrumented before)
■ Tools metrics impact
– processors belonging to donating dedicated partitions are counted in pool size
– PURR stops on context switches
• similar to what happens to shared partitions
• tools will compensate so that dedicated percentages are still relative to total capacity
■ Tools updated
– lparstat, mpstat, sar, topas and topasout reports
Dedicated Processor Donation – how to enable
Dedicated Processor Donation – where it fits in
■ In some cases, dedicated processor partitions are warranted
– Licensing or customer concerns …
– The need for extremely low I/O latency (<1 ms)
– The need for memory affinity or usage of RSETs
– Scalability problems in applications spread over large numbers of virtual processors
■ Shared Dedicated Capacity allows the benefits of dedicated processor partitions, without locking down all of the capacity of processors in the partition
– Idle cycles can be used by uncapped partitions in the shared pool
■ Shared Dedicated Capacity does not help with the footprint problem of requiring the sum of the entitlement of Micro partitions to be less than or equal to the number of processors in the shared pool
– Since Shared Dedicated Capacity donation to the shared pool is opportunistic, based on load
Dedicated Processor Donation - lparstat
$ lparstat -i
Node Name : va01
Partition Name : va
Partition Number : 2
Type : Dedicated-SMT
Mode : Donating
Entitled Capacity : 1.00
Partition Group-ID : 32770
Shared Pool ID : -
Online Virtual CPUs : 1
Maximum Virtual CPUs : 1
Minimum Virtual CPUs : 1
Online Memory : 800 MB
Maximum Memory : 1024 MB
Minimum Memory : 128 MB
Variable Capacity Weight : -
Minimum Capacity : 1.00
Maximum Capacity : 1.00
Capacity Increment : 1.00
Maximum Physical CPUs in system : 4
Active Physical CPUs in system : 4
Active CPUs in Pool : -
# lparstat 1 3
System configuration: type=Dedicated mode=Donating smt=On lcpu=2 mem=800
%user %sys %wait %idle physc vcsw
---- ---- ---- ----- ---- -------
0.1 0.4 0.0 99.5 0.68 670234
0.0 0.2 0.0 99.8 0.68 670234
0.0 0.2 0.0 99.8 0.68 670234
Notes on the lparstat 1 3 output:
– physc shows actual physical processor consumption: number of physical processors minus donated and stolen cycles
– vcsw: donation causes hardware context switches
– %user, %sys, %wait and %idle stay relative to partition capacity, in this case one processor
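The relationship between physc and the partition's capacity can be checked with a few lines of code. The sketch below parses the three interval lines from the lparstat 1 3 sample above and subtracts the average physc from the entitled capacity of 1.00 reported by lparstat -i; the parser and variable names are ours, and the difference is only an approximation of the donated plus stolen share.

```python
# Interval lines copied from the lparstat 1 3 sample above.
SAMPLE = """\
%user %sys %wait %idle physc vcsw
---- ---- ---- ----- ---- -------
0.1 0.4 0.0 99.5 0.68 670234
0.0 0.2 0.0 99.8 0.68 670234
0.0 0.2 0.0 99.8 0.68 670234
"""

PARTITION_CAPACITY = 1.00  # "Entitled Capacity" from lparstat -i above


def parse_lparstat(text):
    """Turn the header and data lines into a list of {column: value} dicts."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header = [name.lstrip("%") for name in lines[0]]
    rows = []
    for fields in lines[1:]:
        if all(set(field) <= {"-"} for field in fields):
            continue  # skip the dashed separator line
        rows.append(dict(zip(header, map(float, fields))))
    return rows


rows = parse_lparstat(SAMPLE)
avg_physc = sum(row["physc"] for row in rows) / len(rows)
not_consumed = PARTITION_CAPACITY - avg_physc
print(f"average physc: {avg_physc:.2f}, donated or stolen: {not_consumed:.2f}")
```

In this sample the partition is almost entirely idle, yet physc stays at 0.68 rather than dropping to zero; the remaining 0.32 of a processor is what went to donation and hypervisor stealing.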
Dedicated Processor Donation - lparstat details
■ New -d flag shows more details
■ Example with donation enabled
# lparstat -d
System configuration: type=Dedicated mode=Donating smt=On lcpu=2 mem=800
%user  %sys  %wait  %idle  %idon  %bdon  %istol  %bstol
-----  ----  -----  -----  -----  -----  ------  ------
  0.1   0.2    2.1   97.7  12.79    6.8     4.8    2.75
■ Example without donation and in combination with -h
# lparstat -dh
System configuration: type=Dedicated mode=Capped smt=On lcpu=2 mem=800
%user  %sys  %wait  %idle  %hypv  hcalls  %istol  %bstol
-----  ----  -----  -----  -----  ------  ------  ------
  0.1   0.2    2.1   97.7    0.0     391     4.8    2.75
%idon, %bdon: percentages of idle and busy times donated
%istol, %bstol: percentages of idle and busy times stolen
Dedicated Processor Donation - sar and mpstat
■ sar
– automatically displays physc when donation is enabled
■ mpstat
– automatically displays pc and lcs if donation is enabled
– new -h option to show more details on hypervisor related statistics
• donation enabled
System configuration: lcpu=2 mode=Donating
cpu   pc    ilcs     vlcs        idon  bdon  istol  bstol
0     0.3   50327    687231635   10.2  4.5   0.59   0.32
1     0.5   61702    684989764   10.2  4.5   0.59   0.32
ALL   0.8   112029   1372221399  20.4  9.0   1.18   0.64
• donation disabled
System configuration: lcpu=2 mode=Capped
cpu   pc    ilcs     vlcs        istol  bstol
0     0.3   503727   687231635   0.59   0.32
1     0.41  61702    684989764   0.59   0.32
ALL   0.71  565429   1372221399  1.18   0.64
• shared partition
System configuration: lcpu=2 ent=0.5 mode=Uncapped
cpu   pc    ilcs     vlcs
0     0.6   503727   687231635
1     0.6   61702    684989764
ALL   0.8   565429   1372221399
idon, bdon: percentages of idle and busy times donated
istol, bstol: percentages of idle and busy times stolen
Dedicated Processor Donation - topas -L
Interval: 2    Logical Partition:    Fri Sep 22 09:01:46 2006
Donating SMT ON    Online Memory: 3200.0
Partition CPU Utilization    Online Virtual CPUs: 1    Online Logical CPUs: 2
%user  %sys  %wait  %idle  %hypv  hcalls  %istl  %bstl  %idon  %bdon  vcsw
    1     1      0     98      1     200      0    2.1    3.5   10.0   1.0
Dedicated Processor Donation - topas -C
■ Example of topasout report for CEC recording
Report: Topas CEC Detailed --- hostname: ptoolsl1 version: 1.2
Start:02/09/06 06.30.00 Stop:02/09/06 07.30.00 Int:60 Min Range: 600 Min
Partition Info Memory (GB) Processors Avail Pool: 1.3
Monitored : 8 Monitored : 0.0 Monitored : 7 Shr Physical Busy: 2.2
UnMonitored: - UnMonitored: 0.0 UnMonitored: 0 Ded Physical Busy: 0.4
Shared : 6 Available :32.0 Available : 7 Donated Physical CPUs:0.7
Uncapped : 1 UnAllocated: - UnAllocated: 1 Stolen Pysical CPUs: 0.1
Capped : 7 Consumed : 8.7 Shared : 4 Hypervisor
Dedicated : 2 Dedicated : 3 Virt. Context Switch:332
Donating : 2 Donated : 1 Phantom Interrupts : 2
Pool Size : 2
Host OS M Mem InU Lp Us Sy Wa Id PhysB Vcsw Ent %EntC PhI
--------------------------------shared------------------------------------------
ptools1 A53 u 1.1 0.4 1 15 3 0 82 1.30 200 0.50 22.0 5
ptools5 A53 U 12 10 2 12 3 0 85 0.20 121 0.25 0.3 3
ptools3 A53 C 5.0 2.6 2 10 1 0 89 0.15 52 0.25 0.3 2
ptools7 A53 c 2.0 0.4 1 0 1 0 99 0.05 2 0.10 0.3 2
Host OS M Mem InU Lp Us Sy Wa Id PhysB Vcsw %istl %bstl %bdon %idon
------------------------------dedicated-----------------------------------------
ptools4 A53 D 0.6 0.3 2 12 3 0 85 0.60 110 1 2 0 5
ptools6 A52 d 1.1 0.1 1 11 7 0 82 0.50 50 10 5 10 0
ptools8 A52 1.1 0.1 1 11 7 0 82 0.50 5 0 1 - -
ptools2 A52 1.1 0.1 1 11 7 0 82 0.50 4 0 2 - -
Time: 07.30.00 -----------------------------------------------------------------
[Slide callouts on the report: donating partitions, donated processors, stolen cycles, donated cycles]
I/O Monitoring with iostat – Service Times (Review)
# iostat -D 10
hdisk1  xfer:   %tm_act  bps      tps      bread    bwrtn
                87.7     62.5M    272.3    62.5M    823.7
        read:   rps      avgserv  minserv  maxserv  timeouts  fails
                271.8    9.0      0.2      168.6    0         0
        write:  wps      avgserv  minserv  maxserv  timeouts  fails
                0.5      4.0      1.9      10.4     0         0
        queue:  avgtime  mintime  maxtime  avgwqsz  avgsqsz   sqfull
                1.1      0.0      14.1     0.2      1.2       2374
– avgsqsz can’t exceed queue_depth for the disk
– If sqfull is often > 0, then increase queue_depth
Virtual adapter’s extended throughput report (-D)
Metrics related to transfers (xfer:)
tps Indicates the number of transfers per second issued to the adapter.
recv The total number of responses received from the hosting server to this adapter.
sent The total number of requests sent from this adapter to the hosting server.
partition id The partition ID of the hosting server, which serves the requests sent by this adapter.
Adapter Read/Write Service Metrics (read:)
avgserv Indicates the average time. Default is in milliseconds.
minserv Indicates the minimum time. Default is in milliseconds.
maxserv Indicates the maximum time. Default is in milliseconds.
Adapter Wait Queue Metrics (wait:)
avgtime Indicates the average time spent in wait queue. Default is in milliseconds.
mintime Indicates the minimum time spent in wait queue. Default is in milliseconds.
maxtime Indicates the maximum time spent in wait queue. Default is in milliseconds.
avgwqsz Indicates the average wait queue size.
avgsqsz Indicates the average service queue size (waiting to be sent to the disk).
sqfull Indicates the number of times the service queue becomes full.
■ Earlier AIX 5.3 levels may report sqfull as a delta, but APAR fixes convert it to a rate, so values will be much smaller
■ Default format is hard to read with many hdisks. Use the -l option for wide output
Service Time Goals
– Reads < 20 msecs
– Writes with cache < 2 msecs
– Writes w/o cache < 10 msecs
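The service time goals and the sqfull rule of thumb can be turned into a small check. The sketch below applies them to the hdisk1 figures from the sample output; the function name and message wording are ours, and whether a disk has write cache is something the caller has to know.

```python
# Goals from the slide, in milliseconds.
READ_GOAL_MS = 20.0
WRITE_GOAL_CACHED_MS = 2.0
WRITE_GOAL_UNCACHED_MS = 10.0


def check_disk(read_avgserv, write_avgserv, sqfull, write_cache=True):
    """Compare iostat -D averages with the goals and return any findings."""
    findings = []
    if read_avgserv >= READ_GOAL_MS:
        findings.append(f"read avgserv {read_avgserv} ms misses the {READ_GOAL_MS} ms goal")
    write_goal = WRITE_GOAL_CACHED_MS if write_cache else WRITE_GOAL_UNCACHED_MS
    if write_avgserv >= write_goal:
        findings.append(f"write avgserv {write_avgserv} ms misses the {write_goal} ms goal")
    if sqfull > 0:
        findings.append("service queue filled up; consider raising queue_depth")
    return findings


# hdisk1 values from the sample output above.
with_cache = check_disk(read_avgserv=9.0, write_avgserv=4.0, sqfull=2374)
without_cache = check_disk(read_avgserv=9.0, write_avgserv=4.0, sqfull=2374,
                           write_cache=False)
for finding in with_cache:
    print(finding)
```

For hdisk1 the reads are within the goal, the writes are only acceptable if the disk has no write cache, and the non-zero sqfull points at queue_depth.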
iostat tape support (TL-07)
■ Uses existing dkstat structures to store metrics
– same as disk devices
– includes support for service time monitoring
– but there is no queuing, so no wait metrics
■ Initially only ATAPE devices are going to be supported
■ Detailed output example (-p for tapes)
# iostat -Dp 1 1
System configuration: lcpu=1 tapes=1 drives=1 paths=2 vdisks=0
rmt0    xfer:   %tm_act  bps      tps      bread    bwrtn
                1.0      5.8K     1.4      799.0    5.0K
        read:   rps      avgserv  minserv  maxserv  timeouts  fails
                0.1      6.6      0.1      53.8     0         0
        write:  wps      avgserv  minserv  maxserv  timeouts  fails
                1.3      8.2      0.9      113.7    0         0
Virtual Shared Processor Pools (POWER6 & TL07)
■ Description
– Allows user to set capacity limits on groups of LPARs
– A shared processor pool has two settings
• Maximum capacity – limit on total capacity LPARs in pool can consume
• Reserved entitled capacity – reserved uncapped entitled capacity
– Primary motivation is reduced licensing costs
• Uncapped partitions can be capped to virtual pool’s limit rather than total number of physical processors in pool
■ Configuration
– Up to 64 pools are supported
– Pool 0 is default pool
• Pool 0 is equivalent to the physical shared processor pool
– All attributes of a pool can be changed dynamically
– LPARs can be re-assigned to different pools dynamically
Virtual Shared Processor Pools
[Figure: server with 12 processor cores, of which 9 form the physical shared pool]
Partitions in the figure:
n1: i5/OS
n2: AIX, DB2
n3: Linux
n4: Uncapped, AIX, DB2, VP = 4, Ent. = 1.80
n5: Uncapped, AIX, DB2, VP = 4, Ent. = 1.7
n6: Uncapped, Linux, WAS, VP = 4, Ent. = 2.00
n7: Uncapped, i5/OS, WAS, VP = 7, Ent. = 2.00
n8: Uncapped, AIX, WAS, VP = 3, Ent. = 1.00
POWER6 Multiple shared pools:
• Can reduce the number of software licenses by putting a limit on the number of processors an uncapped partition can use
• Up to 64 shared pools
Virtual Shared pool #1 Max Cap: 5 processors
Virtual Shared pool #2 Max Cap: 6 processors
DB2 cores to license:
• 1 from dedicated partition n2
• 5 from pool 1
= 6
WebSphere cores to license:
• 6 from pool 2
= 6
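The licensing arithmetic for this example can be written out as below. Two things are assumptions on our part, since the extracted slide does not state them: that pool #1 holds the DB2 partitions n4 and n5 while pool #2 holds the WebSphere partitions n6, n7 and n8, and that the licensable core count for a pool is the smaller of the partitions' combined virtual processors and the pool's maximum capacity. Actual licensing terms are set by the software vendor.

```python
def cores_to_license(dedicated_cores, pool_vps, pool_max_cap):
    """Dedicated cores plus what the pooled partitions can reach, capped by the pool."""
    return dedicated_cores + min(sum(pool_vps), pool_max_cap)


# DB2: dedicated partition n2 (1 core) plus n4 and n5 (VP = 4 each) in pool #1,
# which has a maximum capacity of 5 processors.
db2_cores = cores_to_license(dedicated_cores=1, pool_vps=[4, 4], pool_max_cap=5)

# WebSphere: n6 (VP = 4), n7 (VP = 7) and n8 (VP = 3) in pool #2,
# which has a maximum capacity of 6 processors.
was_cores = cores_to_license(dedicated_cores=0, pool_vps=[4, 7, 3], pool_max_cap=6)

print(f"DB2 cores to license: {db2_cores}")
print(f"WebSphere cores to license: {was_cores}")
```

Without the pool caps, the same rule would count 8 virtual processors for DB2 and 14 for WebSphere, before any limit from the 9-core physical pool.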
Virtual Shared Processor Pools
■ Hardware Requirements
– POWER6 or later
– HMC-managed
• Virtual shared processor pools are not supported with IVM
■ Software Requirements
– eFW 3.2 or later
– AIX 5.3 TL07 or later
– AIX 6.1 or later
Enable Monitoring of the shared pool usage
■ Surprisingly, many customers do not seem to be prepared for monitoring the shared pool
■ Make sure at least one partition on the CEC can do pool monitoring!
■ Required for lparstat to see free pool resources, but topas gets around this because it can collect data from remote agents and calculate the value itself
Multiple shared pools (topas -C)
■ New pool section
– Turned on by using “p” on any topas CEC panel
• Short, long and no header options
– Cursor and “f” key trigger focus on single pool
• Lists shared partitions using that virtual pool
pool  psize  ent  maxc  physb  app  mem  inu
1     8      6.5  12.0  4.8    3.2  128  80.5
2     8      5.0  8.0   2.1    5.9  64   55.3
Host OS M Mem InU Lp Us Sy Wa Id PhysB Vcsw Ent %EntC PhI
-------------------------------------shared-------------------------------
ptoolsl1 53 U 3.1 1.9 4 1 2 0 96 0.01 398 0.20 5.3 0
Host OS M Mem InU Lp Us Sy Wa Id PhysB Vcsw %istl %bstl %bdon %idon
------------------------------------dedicated-----------------------------
ptools1 61 D 3.1 0.9 2 0 0 0 99 0.00 177 - - 0 20
ptoolsl3 61 S 3.1 0.9 2 0 0 0 99 0.00 170 - - - -
psize = pool size (effective capacity)
physb = shared physB
ent = entitlement
maxc = maximum capacity
app = available pool processors
mem = memory
inu = memory in use
Overview of topas / nmon / topasrec
■ AIX 5.3 TL09 and AIX 6.1 TL02
■ topas is a curses-based tool used to monitor various performance parameters (statistics) of the system. Supported with the operating system since AIX 4.3.
■ nmon is also a curses-based tool for system performance monitoring and also has recording capabilities. Developed by Nigel Griffiths (IBM).
■ Development has integrated nmon-like functionality into AIX
– Legacy topas and nmon options supported
– Legacy recording formats supported (input into nmon Analyser, etc)
■ topasrec is a new tool used to start topas local / CEC recording in binary format
– AIX Local recordings previously used xmwlm agent
– AIX CEC recordings previously used topas with -R option
'nmon' in AIX
■ Can be started by running command 'nmon' or ‘topas_nmon’
■ Can be started by pressing “~” from topas screen
./topas_nmon -h
Hint: topas_nmon [-h] [-s <seconds>] [-c <count>] [-f -d -t -r <name>] [-x]
Command: TOPAS-NMON
-h FULL help information - much more than here
Interactive-Mode:
read startup banner and type: "h" once it is running
For Data-Collect-Mode (-f)
-f spreadsheet output format [note: default -s300 -c288]
optional
-s <seconds> between refreshing the screen [default 2]
-c <number> of refreshes [default millions]
-t spreadsheet includes top processes
-x capacity planning (15 min for 1 day = -fdt -s 900 -c 96)
For Interactive-Mode
-s <seconds> between refreshing the screen [default 2]
-c <number> of refreshes [default millions]
-g <filename> User decided Disk Groups
- file = on each line: group_name <hdisk_list> space separated
- like: rootvg hdisk0 hdisk1 hdisk2
- upto 32 groups hdisks can appear more than once
-b black and white [default is colour]
-B no boxes [default is show boxes]
example: topas_nmon -s 1 -c 100
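The -s and -c values quoted in the help text multiply out to a full day in both cases, which is easy to verify. The helper below is only a worked calculation; its name is ours.

```python
def recording_span_seconds(interval_s: int, count: int) -> int:
    """Total time covered by a recording of 'count' snapshots taken every 'interval_s' seconds."""
    return interval_s * count


SECONDS_PER_DAY = 24 * 60 * 60

# Data-collect default from the help text: -s300 -c288
default_span = recording_span_seconds(300, 288)
# Capacity planning shortcut (-x): -s 900 -c 96
capacity_span = recording_span_seconds(900, 96)

print(default_span == SECONDS_PER_DAY, capacity_span == SECONDS_PER_DAY)
```

Both settings cover 24 hours; -x simply trades 5-minute snapshots for 15-minute ones.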
Initial Screen of nmon
■ Shows resources
Help Screen in nmon
Top process Panel in nmon
■ Enter “t” to see top processes
CPU utilization Panel in nmon
■ Enter 'c' to toggle on CPU utilization panel
Disk Utilization Panel in nmon
■ Enter 'd' to turn on Disk utilization panel
Partition Details Panel in nmon
■ Enter 'p' to turn on partition details panel
Multiple Panels in one screen
Recording using topas / nmon
■ Following are the different options available for recording in nmon
Recording using topas / nmon
■ New command topasrec is introduced to do local / CEC binary topas recordings
■ The naming conventions of the generated recordings are as follows:
– Nmon Style Recording (Custom recording)
• hostname_yymmdd_hhmm.nmon
– Nmon Style Recording (Persistent recording)
• hostname_yymmdd.nmon
– Binary Style Recording (Custom recording)
• hostname_yymmdd_hhmm.topas
– Binary Style Recording (Persistent recording)
• hostname_yymmdd.topas
– CEC Recording (Custom recording)
• hostname_cec_yymmdd_hhmm.topas
– CEC Recording (Persistent recording)
• hostname_cec_yymmdd.topas
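The naming conventions above are regular enough to generate or recognize programmatically, for example when a script collects recordings from several hosts. The sketch below builds the expected file name for each case; the function and its parameters are ours, and the host name is a made-up example.

```python
from datetime import datetime


def recording_name(hostname: str, when: datetime, style: str,
                   persistent: bool = False, cec: bool = False) -> str:
    """Build a recording file name following the conventions listed above.

    style is "nmon" or "topas"; CEC recordings exist only in the topas style.
    """
    if style not in ("nmon", "topas"):
        raise ValueError(f"unknown style: {style}")
    if cec and style != "topas":
        raise ValueError("CEC recordings use the .topas format")
    # Persistent recordings carry only the date; custom ones add hhmm.
    stamp = when.strftime("%y%m%d") if persistent else when.strftime("%y%m%d_%H%M")
    middle = "_cec" if cec else ""
    return f"{hostname}{middle}_{stamp}.{style}"


when = datetime(2008, 10, 26, 9, 5)
custom_nmon = recording_name("ptools1", when, "nmon")
persistent_cec = recording_name("ptools1", when, "topas", persistent=True, cec=True)
print(custom_nmon)
print(persistent_cec)
```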
Recording using topas / nmon (Contd.,)
■ New SMIT panels introduced to operate on topas recordings. Options are provided:
– To start / stop persistent recording (24x7)
– To start / stop WLE data collection
– To choose type of recording:
• Binary / Nmon Style Local recording
• CEC recording
– List Available / Completed recordings
– Generate reports on the completed recordings
VIOS Monitoring using topas
▪ Run topas -C and press 'v' to show the VIOS Monitoring Panel
▪ All systems must be at AIX TL09 or higher to be monitored
VIOS Monitoring using topas
▪ From the topas VIOS panel, move the cursor to a particular VIOS server and press 'd' to get the detailed monitoring for that server
Topas Adapter / MPIO panel
▪ From the topas Disk Panel, press 'd' to toggle the Adapter Panel on/off and press 'm' to toggle the Path panel on/off.
svmon Report Enhancements (5.3 TL09)
▪ Reports
– A new option -O is added to change the content and presentation of the reports that the svmon command generates.
– Filtering and sorting options
– To override the default values that are defined for the -O flag, a user can define the .svmonrc configuration file in the directory where the svmon command is launched.
– -X option is added to generate reports in XML format
▪ RBAC Enablement (AIX 6.1 TL02 only) / Non-root user access
▪ Memory Affinity information
svmon Report Options (-O values)
▪ The following values can be passed to the -O option:
– activeusers=[on | off], affinity=[on | detail | off], commandline=[on | off],
– filename=[on | off], filtercat=[off exclusive kernel shared unused unattached],
– filterpgsz=[off s m L S], filterprop=[off notempty data text],
– filtertype=[off working persistent client], format=[80 | 160 | nolimit], frame=[on | off],
– mapping=[on | off], mpss=[on | off], overwrite=[on | off], pgsz=[on | off],
– pidlist=[on | number | off], process=[on | off], range=[on | off],
– segment=[on | category | off],
– shmid=[on | off], sortentity=[inuse | virtual | ....] (depending on the selected summary),
– sortseg=[inuse | pin | pgsp | virtual], subclass=[on | off], summary=[basic | longreal],
– svmonalloc=[on | off], timestamp=[on | off], unit=[auto | page | KB | MB | GB]
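As a rough illustration of how these values combine, the Python sketch below validates a few name=value pairs against a subset of the list above and joins them into a single -O argument. The helper name `build_o_argument` is hypothetical, and the comma-separated form of the argument is an assumption about svmon's syntax that should be confirmed against the svmon documentation for your AIX level.

```python
# Subset of the -O values listed on the slide, with their allowed settings.
ALLOWED = {
    "unit": {"auto", "page", "KB", "MB", "GB"},
    "timestamp": {"on", "off"},
    "sortseg": {"inuse", "pin", "pgsp", "virtual"},
    "format": {"80", "160", "nolimit"},
    "summary": {"basic", "longreal"},
}

def build_o_argument(options):
    """Validate options and return them as one name=value,... string."""
    parts = []
    for name, value in options.items():
        if name not in ALLOWED:
            raise ValueError(f"unknown -O value: {name}")
        if value not in ALLOWED[name]:
            raise ValueError(f"invalid setting for {name}: {value}")
        parts.append(f"{name}={value}")
    # Assumption: multiple -O values are separated by commas.
    return ",".join(parts)
```

A caller could then assemble a command line such as `"svmon -G -O " + build_o_argument({"unit": "MB", "timestamp": "on"})`.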
svmon Report Examples (-O option)
Unused work type segments
POWER6 p575 & p595
▪ Tools adjusted to use Scaled Processor Utilization Resource Register (SPURR)
– Measure of processor time dynamically scaled based on throttling or frequency slewing
• Caused by Thermal Power Management savings mode
• Throttling: delays instruction processing by injecting dead cycles
• Slewing: clock is able to dynamically adjust to other frequencies
– CPU tools updated to show processor rate (%npe)
• 100%: no slewing or throttling
• <100%: percentage of nominal performance
– Adds another layer of complexity to determine utilization
▪ Dedicated Processor Folding
– Workloads consolidated onto as few processors as possible, equivalent to Virtual Processor Folding in shared environments
– mpstat -s is probably the only tool that can accurately detect this.
▪ Memory Throttling
– Larger DIMMs will be throttled; no tools can see this
– Implemented in POWER6 p575 and p595 platforms
– Not expected to be a major issue, but lack of measurement capability is a concern
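To make the "percentage of nominal performance" idea concrete, here is a minimal Python sketch. It assumes that, over a sampling interval, the ratio of the SPURR delta to the PURR delta approximates the fraction of nominal processor speed delivered; that relationship and the function name are assumptions for illustration, not a description of how the AIX tools compute their figure.

```python
def nominal_performance_pct(purr_delta, spurr_delta):
    """Estimate percent of nominal speed from register deltas.

    Assumption: SPURR is PURR scaled by the effective frequency, so
    spurr_delta / purr_delta approximates the delivered fraction.
    """
    if purr_delta <= 0:
        raise ValueError("purr_delta must be positive")
    return 100.0 * spurr_delta / purr_delta
```

With equal deltas the result is 100 (no slewing or throttling); a smaller SPURR delta gives a value below 100.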
AIX 6.1
▪ AIX 6.1 TL01
– Workload Partitions Support
• ps, ipcs, netstat, proc*, trace, vmstat, topas, tprof, filemon, netpmon, pprof, curt
• Separate presentations available to cover WPAR specifics
– Restricted Tunables
– IO pacing
– AIO
– CIO
– NFS biod
– JFS2 nolog
– Multiple Page Size Segments - svmon
– iostat/topas - Filesystem and Workload Partition breakdowns (AIX 6)
▪ AIX 6.1 TL02
– topas Memory Pool and Shared Ethernet monitoring
– filemon Reports
– mpstat/sar WPAR support
– tprof Large Page and Data profiling
Performance Tunables
▪ Tunables now in two categories
▪ Restricted Tunables
– Should not be changed unless recommended by AIX development or development support
– Are not shown by tuning commands unless the -F flag is used
– Dynamic change will show a warning message
– Permanent change must be confirmed
– Permanent changes will cause an error log entry at boot time
▪ Non-Restricted Tunables
– Can have restricted tunables as dependencies
Changing restricted tunables
▪ Changing a restricted tunable dynamically
A dynamic change of a restricted tunable will inform the user.
> ioo -o aio_sample_rate=6
Warning: a restricted tunable has been modified
▪ Changing a restricted tunable permanently
A permanent change of a restricted tunable requires a confirmation from the user.
> ioo -po aio_sample_rate=6
Modification to restricted tunable aio_sample_rate, confirmation yes/no
Note: The system will log changes to restricted tunables in the system error log at boot time.
List restricted tunables
> ioo -aF
aio_active = 0
aio_maxreqs = 65536
...
posix_aio_minservers = 3
posix_aio_server_inactivity = 300
##Restricted tunables
aio_fastpath = 1
aio_fsfastpath = 1
aio_kprocprio = 39
aio_multitidsusp = 1
aio_sample_rate = 5
aio_samples_per_cycle = 6
j2_maxUsableMaxTransfer = 512
j2_nBufferPerPagerDevice = 512
j2_nonFatalCrashesSystem = 0
j2_syncModifiedMapped = 1
j2_syncdLogSyncInterval = 1
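A tuning script may want to know which tunables fall under the restricted heading before it touches them. The Python sketch below splits text in the layout shown above into unrestricted and restricted groups. It assumes the `##Restricted tunables` marker line appears exactly as on this slide; the function name and the shortened sample are illustrative.

```python
SAMPLE = """\
aio_active = 0
aio_maxreqs = 65536
posix_aio_minservers = 3
##Restricted tunables
aio_fastpath = 1
aio_sample_rate = 5
j2_syncModifiedMapped = 1
"""

def split_tunables(output):
    """Return (unrestricted, restricted) dicts of name -> value strings."""
    unrestricted, restricted = {}, {}
    current = unrestricted
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("##Restricted tunables"):
            current = restricted  # everything after the marker is restricted
            continue
        if "=" not in line:
            continue  # skip elisions such as "..." and blank lines
        name, _, value = line.partition("=")
        current[name.strip()] = value.strip()
    return unrestricted, restricted
```

A script could refuse to proceed, or ask for explicit approval, whenever a requested tunable name appears in the restricted group.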
TUNE_RESTRICTED Error Log Entry
LABEL:           TUNE_RESTRICTED
IDENTIFIER:      D221BD55
Date/Time:       Thu May 24 15:05:48 2007
Sequence Number: 637
Machine Id:      000AB14D4C00
Node Id:         quake
Class:           O
Type:            INFO
WPAR:            Global
Resource Name:   perftune
Description
RESTRICTED TUNABLES MODIFIED AT REBOOT
Probable Causes
SYSTEM TUNING
User Causes
TUNABLE PARAMETER OF TYPE RESTRICTED HAS BEEN MODIFIED
Recommended Actions
REVIEW TUNABLE LISTS IN DETAILED DATA
Detail Data
LIST OF TUNABLE COMMANDS CONTROLLING MODIFIED RESTRICTED TUNABLES AT REBOOT, SEE FILE /etc/tunables/lastboot.log
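For automated checks, the header fields of an entry like this one can be pulled into a dictionary. The Python sketch below assumes one `Key: value` pair per line up to the `Description` heading, as in the entry above; the function names and the shortened sample are illustrative and not part of errpt.

```python
SAMPLE_ENTRY = """\
LABEL:           TUNE_RESTRICTED
IDENTIFIER:      D221BD55
Date/Time:       Thu May 24 15:05:48 2007
Resource Name:   perftune
Description
RESTRICTED TUNABLES MODIFIED AT REBOOT
"""

def parse_entry_header(text):
    """Collect the 'Key: value' header fields that precede 'Description'."""
    fields = {}
    for line in text.splitlines():
        if line.strip() == "Description":
            break
        # Split on the first colon only, so timestamps keep their colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

def is_tune_restricted(fields):
    """True when the entry reports modified restricted tunables."""
    return fields.get("LABEL") == "TUNE_RESTRICTED"
```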
Why, you ask?
▪ The number of tunables in AIX had grown to a ridiculously large number
– 5.3 TL06: vmo 61, ioo 27, schedo 42, no 135, plus a few others
– 6.1: vmo 29, ioo 21, schedo 15, no 133, plus a few others
▪ The potential combinations are too numerous to test and document effectively
▪ Many of the tunables had been created to deal with very specific customers or situations which don’t apply often
▪ This wasn’t done in a vacuum; a survey of support and recent situations was employed to identify the commonly used tunables (which remain unrestricted)
▪ If a restricted tunable must be changed, a PMR should be opened to identify the issue
Implementation Considerations
▪ Best Practices
– Do not apply legacy tuning since some tunables may now be restricted
– If you do an upgrade install, your old tunings will be preserved
• You may wish to undo them, but we won’t make you
– This level of tuning has been applied to numerous AIX 5.3 customers through field support
• We are confident this was a good thing
• However, we try to never change defaults in the service stream, so AIX 5.3 remains as it was
– Change restricted tunables only if recommended by AIX support
Implementation Considerations (Cont’d)
▪ Problem Determination
– Common problems seen in field or lab
• Legacy VMM tuning results in error log entries (TUNE_RESTRICTED)
• Tuning scripts fail due to required confirmation for permanent changes of restricted tunables
• Install/tuning scripts fail due to missing aio0 device
– Diagnostics
• Check AIX errpt for TUNE_RESTRICTED
• Check /etc/tunables/lastboot.log
• PERFPMR
VMM File IO Pacing Enabled By Default
▪ IO Pacing Enabled By Default
– Prevents system responsiveness issues due to large quantities of writes
– Limits the maximum number of pages of I/O outstanding to a file
• Without I/O pacing a program can fill up large amounts of memory with written pages. Those “queued” I/O’s can result in long waits for other programs using the storage
• Better solution than the file system write behind techniques
– New defaults
• Not very aggressive, intended to limit one or a few programs from impacting system responsiveness. Values high enough not to impact sequential write performance
• maxpout = 8193
• minpout = 4096
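The high- and low-water-mark behaviour can be pictured with a toy model. The Python sketch below is not the VMM implementation. It assumes a writer is suspended once the pages outstanding to a file reach maxpout and resumes once they drain to minpout, with a fixed number of pages completing per tick; all names and rates are illustrative.

```python
def simulate_pacing(total_pages, issue_per_tick, complete_per_tick,
                    maxpout=8193, minpout=4096):
    """Return (peak outstanding pages, ticks the writer spent suspended)."""
    if issue_per_tick <= 0 or complete_per_tick <= 0:
        raise ValueError("rates must be positive")
    outstanding, remaining = 0, total_pages
    blocked, peak, blocked_ticks = False, 0, 0
    while remaining > 0 or outstanding > 0:
        if blocked and outstanding <= minpout:
            blocked = False  # drained to the low-water mark: wake the writer
        if not blocked and remaining > 0:
            room = maxpout - outstanding
            issued = min(issue_per_tick, remaining, room)
            outstanding += issued
            remaining -= issued
            if outstanding >= maxpout and remaining > 0:
                blocked = True  # hit the high-water mark: suspend the writer
        elif blocked:
            blocked_ticks += 1
        peak = max(peak, outstanding)
        # The storage completes a fixed number of page writes each tick.
        outstanding = max(0, outstanding - complete_per_tick)
    return peak, blocked_ticks
```

A writer that issues pages faster than the storage completes them is capped at maxpout outstanding pages, while a slow writer never reaches the limit and is never suspended.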
AIO Support
▪ Interface Changes
– All the AIO entries in the ODM and AIO smit panels have been removed
– The aioo command will no longer be shipped
– All the AIO tunables have current, default, minimum and maximum values that can be viewed with ioo
▪ AIO kernel extension loaded at system boot
– Applications no longer fail to run because you forgot to load the kernel extension (you may applaud here)
– No AIO servers are active until requests are present
– Extremely low impact on memory requirements with this implementation
Improvements to AIO CIO
▪ AIO Fast Path for CIO enabled by default
– With the fast path, the AIO server threads no longer participate in the I/O path
– By removing the AIO servers from the path, we get three things
• The removal of AIO servers as any potential resource bottleneck
• The reduction in path length for AIO read/write services, as less dispatching is required
• Potentially better coalescing of sequential I/O requests initiated through AIO or LISTIO services
▪ Fast Path has been enabled for LVs and PVs for a long time
– No change in behavior for environments such as Oracle 10G/ASM on raw hdisks
[Diagram: two I/O stacks compared. FS no Fast Path: Application, AIO Server, File System, LVM, Device Driver. CIO Fast Path: Application, File System, LVM, Device Driver.]
General improvements to AIO
▪ The number of AIO servers varies between minservers and maxservers (times #CPUs), based on workload
– AIO servers stay active as long as they service requests
– Number of AIO servers dynamically increased/reduced based on the demand of the workload
– aio_server_inactivity defines after how many seconds of idle time an AIO server will exit
– Do not confuse no active servers with kernel extension not loaded. The kernel extension is always loaded
▪ Changes to AIO tunables are dynamic through ioo
– Changes do not require system reboot
– minservers is changed to a per CPU tunable
– maxservers is changed to 30
– maxreqs is changed to 65536
▪ Benefit
– No longer necessary to tune the minservers/maxservers/maxreqs as in the past
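Since both limits scale with the number of CPUs, the range of AIO servers a system may run is simple arithmetic. The Python sketch below uses a maxservers value of 30 from this slide; the minservers value of 3 is taken from the `posix_aio_minservers = 3` line in the earlier `ioo -aF` listing and is assumed here to be a representative default. The function name is hypothetical.

```python
def aio_server_bounds(ncpus, minservers_per_cpu=3, maxservers_per_cpu=30):
    """Return (lower, upper) bounds on AIO servers for ncpus processors."""
    if ncpus <= 0:
        raise ValueError("ncpus must be positive")
    # Both tunables are per-CPU values, so they are multiplied by ncpus.
    return ncpus * minservers_per_cpu, ncpus * maxservers_per_cpu
```

On an 8-CPU partition, for example, this gives a range of 24 to 240 servers, with the actual number following the workload.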
CIO Read Mode Flag
▪ Allows an application to open a file for CIO such that subsequent opens without CIO avoid demotion
– In the past, a second opening of a file without CIO would cause “demotion”, which removes many of the benefits of CIO
– The second read-only opening without CIO will still result in that opening having uncached reads to the file. Thus, such programs should ensure that the I/O sizes are large enough to achieve I/O efficiency
▪ For example, a backup application can access database files in read-only mode while the database has the file opened in concurrent I/O mode
▪ open() flag is O_CIOR
▪ procfiles does not currently reflect O_CIO/O_CIOR
– kdb 'u <slotnumber>' then for each file listed there 'file <filepointer>' gives some info
NFS Performance Improvements
▪ RFC 1323 enabled by default
– Allows for TCP window scaling beyond 64K, so more one-way packets in flight are allowed between acks for large sequential transfers. We had the nfs_rfc1323 tunable before; it just wasn't enabled by default.
▪ Increase default number of biod daemons
– 32 biod daemons per NFS V3 mount point
– Very slight increase in memory (<2MB) required over previous default of 4
– Enables more I/O’s to be outstanding at the same time; doesn’t speed sequential operations much, but helps random access (e.g. OLTP)
▪ Default read/write size increased to 64k for TCP connections
– Was 32k previously
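To see why window scaling matters, compare the bandwidth-delay product of a link with the 65535-byte limit of an unscaled TCP window. The Python sketch below computes the product and the smallest RFC 1323 shift count (0 to 14) whose scaled window covers it. The link speeds and round-trip time in the usage note are illustrative, not measurements from this presentation.

```python
MAX_UNSCALED_WINDOW = 65535  # largest window without the RFC 1323 option
MAX_SHIFT = 14               # largest shift count RFC 1323 allows

def window_scale_shift(bandwidth_bits_per_s, rtt_ms):
    """Return (bandwidth-delay product in bytes, minimum shift count)."""
    # bits/s * ms / 8000 = bytes in flight needed to keep the link full
    bdp_bytes = bandwidth_bits_per_s * rtt_ms // 8000
    shift = 0
    while (MAX_UNSCALED_WINDOW << shift) < bdp_bytes and shift < MAX_SHIFT:
        shift += 1
    return bdp_bytes, shift
```

For a 1 Gbit/s link with a 1 ms round trip, the product is 125000 bytes, which already exceeds an unscaled window, whereas a 100 Mbit/s link at the same latency fits without scaling.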
NFS biod changes
▪ Having more biod’s allows better read-ahead and write-behind
▪ However, measured on a single-process basis, there are no huge performance differences over the AIX 5.3 defaults
▪ Results should improve in tests with multiple processes/threads operating over NFS
▪ NFS client tests, p5 520 on 1GB Ethernet with 64kB I/O’s (next slide)
NFS biod changes
[Chart: NFS single process throughput, over 256MB file. Series: 32 biod vs. 4 biod. Categories: read seq server uncached, read seq server cached, read rand server uncached, write seq overwrite, write seq create, write rand create. Y axis: MB/second.]
NFS biod change with Kerberos krb5p
▪ The increase in biod’s has a much more positive impact when using Kerberos DES security
▪ Overlapping more compute with network traffic through more biod’s greatly improves throughput
▪ Same model as previous chart, krb5p (full packet encryption) mount option
[Chart: NFS biod changes with Kerberos. Series: 32 biod vs. 4 biod. Categories: read seq server uncached, read seq server cached, read rand server uncached, write seq overwrite, write seq create, write rand create. Y axis: MB/sec.]
Enhanced JFS2 “nolog” option
▪ JFS2 standard metadata logging for filesystem integrity disabled via a mount option
– Similar to legacy JFS “nointegrity” option
▪ Meant to enable faster migration of data to new storage
– File system operation with heavy file create/delete activity can create log bottlenecks
– Potentially useful for temporary file systems where the filesystem can be easily recreated or fsck’ed
▪ mount -o log=NULL during data migration phase, then unmount and mount with standard logging
Enhanced JFS2 “nolog” option - example
▪ 4-way POWER5 p550, PHP test “Wikibench”
▪ Test makes heavy use of file meta-data
▪ With single disk setup, bottleneck on disk writes to Enhanced JFS2 logs
▪ With “nolog”, the log bottleneck is avoided
[Chart: Disk utilization over time. Series: default log vs. nolog. X axis: time. Y axis: %disk busy.]
[Chart: PHP Wikibench. Bars: Default log vs. nolog. Y axis: Throughput.]
Multiple Page Size Segment (MPSS) Support (6.1 TL01)
▪ POWER6 provides hardware support for mixing 4kB pages and 64kB pages in the same hardware segment
▪ This allows the AIX operating system to transparently promote small pages to medium pages
– This typically improves performance by reducing stress on hardware translation mechanisms
– Controlled with the vmo vmm_default_pspa parameter (-1 turns off)
▪ This behavior is enabled as a default on AIX 6.1 on POWER6 hardware
– Since it is not supported on POWER5, systems running identical application conditions on POWER5 and POWER6 may differ on exact memory page usage
– In general, no increase in memory consumption should be noticed; however, the usage of 64kB pages may increase on POWER6
– System paging activity may result in 64kB pages being broken into 4kB pages
– 64kB pages that are broken by paging won’t usually be reconstituted into 64kB pages later
svmon Mixed Page Sizes (6.1 TL01)
▪ AIX 6.1 will dynamically collapse sets of 4K pages into 64K pages
– creates mixed page size segments
▪ Short reports update
 Vsid   Esid  Type  Description          PSize  Inuse  Pin  Pgsp  Virtual
 1c8a6     2  work  process private          s     81    3     0       81
 2869      2  work  process private          s     81    3     0       81
 12881     2  work  process private          s     81    3     0       81
 14842     2  work  process private          s     81    3     0       81
 e7cf      f  work  shared library data     sm     69    0     0       69
▪ Long (-l) reports update
 Vsid   Esid  Type  Description          PSize  Inuse  Pin  Pgsp  Virtual
 1c8a6     2  work  process private          s     81    3     0       81
 2869      2  work  process private          s     81    3     0       81
 12881     2  work  process private          s     81    3     0       81
 14842     2  work  process private          s     81    3     0       81
 e7cf      f  work  shared library data      s      5    0     0        5
                                             m      4    0     0        4
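The two reports can be reconciled with a little arithmetic. The Python sketch below assumes the short report expresses a mixed (sm) segment in 4 KB units, which is consistent with the numbers shown: 5 small pages plus 4 medium pages of 64 KB each come to 69. The function name is illustrative.

```python
PAGE_KB = {"s": 4, "m": 64}  # small and medium page sizes in KB

def inuse_in_4k_units(counts):
    """Convert per-page-size counts into an equivalent number of 4 KB pages."""
    return sum(n * PAGE_KB[size] // 4 for size, n in counts.items())

# Long-report rows for segment e7cf: 5 small pages and 4 medium pages.
total = inuse_in_4k_units({"s": 5, "m": 4})
```

Here `total` matches the Inuse value of 69 that the short report shows for the same segment.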
svmon MPSS detail
svmon -D d3a7
Segid: d3a7
Type: working
PSize: sm (4 KB - 64 KB)
Address Range: 0..4095
Size of page space allocation: 3744 pages (14.6 MB)
Virtual: 4096 frames (16.0 MB)
Inuse: 582 frames (2.3 MB)
 Page  Psize   Frame  Pin  ExtSegid  ExtPage
    0      m  442176    Y         -        -
    1      m  442177    Y         -        -
    2      m  442178    Y         -        -
  382      s  362140    N         -        -
  435      s  430534    N         -        -
iostat File System (6.1 TL01)
▪ Available in AIX 6.1
-f to display file system and hdisk utilization (below)
-F to display file system activity only
System configuration: lcpu=2 drives=2 ent=0.50 paths=2 vdisks=2 fs=9
tty:   tin   tout   avg-cpu:  % user  % sys  % idle  % iowait  physc  % entc
       0.0   72.0               39.0    4.9    53.8       2.3    0.2    46.0
Disks:     % tm_act    Kbps    tps  Kb_read  Kb_wrtn
hdisk0         37.0  3897.0   70.0        0     3897
hdisk1         50.0  3897.0   70.0        0     3897
FS Name:   % tm_act    Kbps    tps  Kb_read  Kb_wrtn
/                 -     3.7    2.0        3        0
/usr              -     0.0    0.0        0        0
/var              -     0.0    0.0        0        0
/tmp              -    43.8  968.0        0       43
/home             -     0.0    0.0        0        0
/admin            -     0.0    0.0        0        0
/proc             -     0.0    0.0        0        0
/opt              -     0.0    0.0        0        0
topas File System (6.1 TL01)
▪ Available in AIX 6.1
-f to specify number of monitored file systems on main screen at startup
-F to start file system view on startup or (F) to toggle screen display
Topas Monitor for host: ec03   Interval: 2   Sun Jul 20 19:21:21 2008
================================================================================
FileSystem   KBPS    TPS   KB-R  KB-W  Open  Create  Lock
/tmp         47.0  967.0    0.0  47.0     0       0     0
/var         10.0    0.0  202.0   0.0     0       0     0
/usr          0.0    0.0    0.0   0.0     0       0     0
/             0.0    0.0    0.0   0.0     0       0     0
/home         0.0    0.0    0.0   0.0     0       0     0
/audit        0.0    0.0    0.0   0.0     0       0     0
/admin        0.0    0.0    0.0   0.0     0       0     0
/proc         0.0    0.0    0.0   0.0     0       0     0
/opt          0.0    0.0    0.0   0.0     0       0     0
topas Memory Pool (6.1 TL02)
▪ From the topas CEC panel, press 'm' to view the Memory Pool Panel, then press 'f' while focused on a memory pool to view the partition-level usage for the selected memory pool
topas Shared Ethernet Adapter (6.1 TL02)
▪ Press E to display the Shared Ethernet Adapter (SEA) on a Virtual I/O Server.
mpstat/sar WPAR Support (6.1 TL02)
▪ Commands mpstat / sar are enabled to show statistics when invoked within a WPAR
▪ -@ option is added to mpstat / sar to collect and display statistics of a specified WPAR from the Global environment
▪ New field 'rset' is added to the Configuration line of the mpstat / sar report to indicate the type of rset that a particular WPAR is associated with.
▪ A new row with cpuid “R” is added to the per-processor utilization report of mpstat / sar. The “R” row will show the RSET level utilization.
▪ Disk statistics are not available inside a WPAR, hence sar will not report disk statistics inside a WPAR
mpstat / sar (Contd.,)
▪ To view processor statistics for the processors that belong to the rset associated with a specified WPAR, run mpstat -@ <wparname>
▪ Invoking mpstat inside a WPAR to view statistics for all the processors in the system
▪ Invoking mpstat inside a WPAR to view SMT threads
mpstat / sar (Contd.,)
▪ Invoking sar inside a WPAR to view RSET processor statistics. The red-circled CPU ID ('R') provides the RSET level utilization
▪ Invoking sar inside a WPAR to view all processor statistics. The red-circled CPU ID with a prefix '*' indicates that the CPU is associated with the RSET used by the WPAR
filemon Report Enhancements (6.1 TL02)
▪ New filtering options are added to the -O option of filemon to generate new types of report
– lf[=num]: monitor logical file I/O and display first num records where num > 0
– vm[=num]: monitor virtual memory I/O and display first num records where num > 0
– lv[=num]: monitor logical volume I/O and display first num records where num > 0
– pv[=num]: monitor physical volume I/O and display first num records where num > 0
– pr[=num]: display data process-wise and display first num records where num > 0
– th[=num]: display data thread-wise and display first num records where num > 0
– all[=num]: short for lf,vm,lv,pv,pr,th and display first num records where num > 0
– detailed: display detailed information other than summary report
– abbreviated: Abbreviated mode (transactions)
– collated: Collated mode (transactions)
▪ New options added to make filemon run in automated offline mode
– A: Enable Automated Offline Mode
– x: Provide the user command to execute; use double quotes if you provide arguments to the command
– r: Root String for trace and gennames filenames
filemon – Abbreviated Report
# filemon -r trace -O abbreviated
filemon - Collated Report
# filemon -r trace -O collated
tprof Large Page Analysis (6.1 TL02)
▪ New option 'a' is introduced to enable large page analysis. tprof -a collects a profile trace from a representative application run and produces performance projections for mapping different portions of the application's data space to different page sizes.
▪ Large Page Analysis uses the information in the trace to project translation buffer performance when mapping any of the following four application memory regions to a different page size:
– static application data (initialized and uninitialized data)
– application heap (dynamically allocated data)
– stack
– application text
▪ Performance projections are provided for each of the page sizes supported by the operating system. The first performance projection is a baseline projection for mapping all four memory regions to the default 4KB pages.
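The intuition behind these projections is that a region mapped with larger pages needs far fewer translations. The Python sketch below simply counts the pages needed to cover a region at 4 KB, 64 KB and 16 MB page sizes. It is not tprof's projection model, and the 256 MB region size in the usage note is an arbitrary example.

```python
PAGE_SIZES = {
    "4KB": 4 * 1024,
    "64KB": 64 * 1024,
    "16MB": 16 * 1024 ** 2,
}

def pages_needed(region_bytes):
    """Return the number of pages of each size needed to map the region."""
    # Ceiling division: a partly used page still needs its own translation.
    return {name: -(-region_bytes // size) for name, size in PAGE_SIZES.items()}
```

A 256 MB heap, for example, needs 65536 pages at 4 KB but only 4096 at 64 KB and 16 at 16 MB, which is why the baseline 4 KB projection tends to show the most translation misses.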
tprof Large Page Analysis (Contd.,)
Memory Reference and Allocation counts
Memory References, Allocations summary by process
Memory References by Modeled regions
Performance Projections of Memory Translation Misses by modeled regions for various page sizes
tprof Data Profiling (6.1 TL02)
▪ New options 'b' and 'B' are introduced to enable basic data profiling in tprof. Basic data profiling reports data access information.
▪ Summary section reports access information across kernel data, library data, user global data and stack and heap sections for each process.
▪ When used with -s, -u, -k and -e, tprof data profiling reports the most used data structures (exported data symbols) in shared library, binary, kernel and kernel extensions. The -B flag enables the reporting of function names that accessed the data structures
tprof Data Profiling (Contd.,)
Summary section which reports the % of data access by each process
Summary section which reports the % of data access for each data region in the process
tprof Data Profiling (Contd.,)
Detail by Data Structure Name and the subroutines that accessed those data structures
tprof Data Profiling (Contd.,)
Kernel Data Structures Profiling
tprof Data profiling (Contd.,)
Shared Library Data Structures Profiling