IBM Advanced Technical Support - Americas
© 2008 IBM Corporation 04/13/23
AIX Performance: Configuration & Tuning for Oracle
Vijay [email protected] - Oracle Solutions Team
Legal information
The information in this presentation is provided by IBM on an "AS IS" basis without any warranty, guarantee or assurance of any kind. IBM also does not provide any warranty, guarantee or assurance that the information in this paper is free from errors or omissions. Information is believed to be accurate as of the date of publication. You should check with the appropriate vendor to obtain current product information.
Any proposed use of claims in this presentation outside of the United States must be reviewed by local IBM country counsel prior to such use.
IBM, RS6000, System p, AIX, AIX 5L, GPFS, and Enterprise Storage Server (ESS) are trademarks or registered trademarks of the International Business Machines Corporation.
Oracle, Oracle9i and Oracle10g are trademarks or registered trademarks of Oracle Corporation.
All other products or company names are used for identification purposes only, and may be trademarks of their respective owners.
Agenda
AIX Configuration Best Practices for Oracle
– Memory
– I/O
– Network
– Miscellaneous
The suggestions presented here are considered to be basic configuration “starting points” for general Oracle workloads
Your workloads may vary
Ongoing performance monitoring and tuning is recommended to ensure that the configuration is optimal for the particular workload characteristics
AIX Configuration Best Practices for Oracle
Performance Overview – Tuning Methodology
[Figure: CPU, Memory, I/O, Network – Predominant Bottleneck]
• Understand the external view of system performance
The external view of system performance is the observable event that causes someone to say the system is performing poorly. Typically this is (1) end-user response time, (2) application (or task) response time, or (3) throughput. System metrics should not be used to judge improvement.
– End-user response time is the elapsed time between when a user submits a request and when the response is received.
– Application response time is the elapsed time required for one or more jobs to complete. Historically, these jobs have been called batch jobs.
– Throughput is the amount of work that can be accomplished per unit of time. This metric is typically expressed in transactions per minute.
• Performance only improves when the predominant bottleneck is fixed
Fixing a secondary bottleneck will not improve performance and typically results in overloading an already overloaded predominant bottleneck.
• Monitor performance after a change – tuning is an iterative process
Monitoring is required after making a change for two reasons: (1) fixing the predominant bottleneck typically uncovers another bottleneck, and (2) not all changes yield positive results. If possible, use a “repeatable” test so that each change can be accurately evaluated.
Iterative Tuning Process
Stress System (i.e., Tune at Peak workload)
Monitor Sub-Systems
Identify Predominant Bottleneck
Tune Bottleneck
Repeat
Performance Monitoring and Tuning Tools
Status Commands
– CPU: vmstat, topas, iostat, ps, mpstat, lparstat, sar, time/timex, emstat/alstat
– Memory: vmstat, topas, ps, lsps, ipcs
– I/O Subsystem: vmstat, topas, iostat, lvmstat, lsps, lsattr/lsdev, lspv/lsvg/lslv
– Network: netstat, topas, atmstat, entstat, tokstat, fddistat, nfsstat, ifconfig
– Processes & Threads: ps, pstat, topas, emstat/alstat
Monitor Commands
– CPU: netpmon
– Memory: svmon, netpmon, filemon
– I/O Subsystem: fileplace, filemon
– Network: netpmon, tcpdump
– Processes & Threads: svmon, truss, kdb, dbx, gprof, fuser, prof
Trace Level Commands
– CPU: tprof, curt, splat, trace, trcrpt
– Memory: trace, trcrpt
– I/O Subsystem: trace, trcrpt
– Network: iptrace, ipreport, trace, trcrpt
– Processes & Threads: truss, pprof, curt, splat, trace, trcrpt
Tuning tools
– CPU: schedo, fdpr, bindprocessor, bindintcpu, nice/renice, setpri
– Memory: vmo, rmss, fdpr, chps/mkps
– I/O Subsystem: ioo, lvmo, chdev, migratepv, chlv, reorgvg
– Network: no, chdev, ifconfig
– Processes & Threads: nfso, chdev, fdpr
AIX Memory Management Overview
The role of Virtual Memory Manager (VMM) is to provide the capability for programs to address more memory locations than are actually available in physical memory.
On AIX this is accomplished using segments that are partitioned into fixed sizes called “pages”.
– A segment is 256M
– default page size 4K
– POWER 4+ and POWER5 can define large pages, which are 16M
A 32-bit or 64-bit effective address translates into a 52-bit or 80-bit virtual address (VA)
– 32-bit system: the upper 4 bits select a segment register that contains a 24-bit segment ID; the remaining 28 bits are the offset.
• 24-bit segment ID + 28-bit offset = 52-bit VA
– 64-bit system: the upper 36 bits select a segment register that contains a 52-bit segment ID; the remaining 28 bits are the offset.
• 52-bit segment ID + 28-bit offset = 80-bit VA
The VMM maintains a list of free frames that can be used to retrieve pages that need to be brought into memory.
– The VMM replenishes the free list by removing some of the current pages from real memory (i.e., steal memory).
– The process of moving data between memory and disk is called “paging”.
The VMM uses a Page Replacement Algorithm (implemented in the lrud kernel threads) to select pages that will be removed from memory.
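The address arithmetic on this slide can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and it assumes the split described above (the low 28 bits are the offset within a 256 MB segment, the remaining upper bits select the segment register).

```python
SEGMENT_OFFSET_BITS = 28   # 2**28 bytes = 256 MB per segment
PAGE_SIZE = 4096           # default 4 KB page


def split_effective_address(addr, address_bits=64):
    """Split an effective address into (segment register selector, offset).

    Hypothetical helper: the low 28 bits are the offset, the remaining
    upper bits (4 for 32-bit, 36 for 64-bit) select the segment register.
    """
    if not 0 <= addr < (1 << address_bits):
        raise ValueError("address does not fit in %d bits" % address_bits)
    offset = addr & ((1 << SEGMENT_OFFSET_BITS) - 1)
    selector = addr >> SEGMENT_OFFSET_BITS
    return selector, offset


def virtual_address(segment_id, offset):
    """Concatenate a segment ID and a 28-bit offset into a virtual address."""
    return (segment_id << SEGMENT_OFFSET_BITS) | offset


# Pages per 256 MB segment with the default 4 KB page size
PAGES_PER_SEGMENT = (1 << SEGMENT_OFFSET_BITS) // PAGE_SIZE
```

With a 52-bit segment ID and a 28-bit offset, the largest value `virtual_address` can return fits in exactly 80 bits, which matches the 80-bit VA quoted for 64-bit systems.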
Virtual Memory Space – 64 Bits
64-bit address: 36 bits select the segment register, 28 bits are the offset within the segment
Each segment register contains a 52-bit segment ID
52-bit segment ID + 28-bit offset = 80-bit virtual address
Virtual memory: 1 trillion terabytes, or 1 yottabyte
Segment IDs start at 0; the segments shown include the Kernel Segment, the Page Space Disk Map and the Kernel Heap
256 MB segment
– The 28-bit offset accesses a specific location in the segment: 2^28 = 256M
– Each segment is divided into 4096-byte chunks called pages
– Each segment can have a maximum of 65536 pages
Memory Tuning Overview
Memory:
– Virtual Memory (General): minfree, maxfree, lru_file_repage, lru_poll_interval
– Large Pages (Pinned Memory): v_pinshm, lgpg_regions, lgpg_size
– JFS: maxperm, strict_maxperm
– Enhanced JFS (JFS2): maxclient, strict_maxclient
NAME               CUR    DEF    BOOT   MIN   MAX     UNIT           TYPE
--------------------------------------------------------------------------------
lru_file_repage    1      1      1      0     1       boolean        D
lru_poll_interval  0      0      0      0     60000   milliseconds   D
maxclient%         80     80     80     1     100     % memory       D
maxfree            1088   1088   1088   8     200K    4KB pages      D
maxperm%           80     80     80     1     100     % memory       D
minfree            960    960    960    8     200K    4KB pages      D
strict_maxclient   1      1      1      0     1       boolean        D
strict_maxperm     0      0      0      0     1       boolean        D
minperm%           20     20     20     1     100     % memory       D
vmo -p -o <parameter name>=<new value>
The -p flag also updates /etc/tunables/nextboot, so the change persists across a reboot
Virtual Memory Manager (VMM) Tuning
The AIX “vmo” command displays and updates several parameters that influence the way AIX manages physical memory
– The “-a” option displays current parameter settings
vmo -a
– The “-o” option is used to change parameter values
vmo -o minfree=1440
– The “-p” option is used to make changes persist across a reboot
vmo -p -o minfree=1440
A number of the default “vmo” settings are not optimized for database workloads and should be modified for Oracle environments
VMM Tuning
Suggested Combination
– maxperm% = maxclient% = <High Percentage>
– minperm% = <Low Percentage>
– strict_maxperm=0
– strict_maxclient=1
– lru_file_repage=0
– lru_poll_interval=10
The file cache will be allowed to grow; however, when the VMM needs memory it will steal only file pages. Why? Because we’ve set lru_file_repage=0.
What is <High Percentage>?
– If possible, set it so that maxclient% is always greater than numclient% (vmstat -v)
• Why? maxclient% is a hard limit (strict_maxclient=1); as long as numclient% stays below it, lrud does not have to run to enforce the limit
What is <Low Percentage>?
– Set it so that numperm% (vmstat -v) is always greater than minperm%
• Why? If numperm% drops below minperm%, the VMM behaves as if lru_file_repage were set to 1 and will steal computational pages
VMM Tuning Combination Summary – Goal is to prevent paging of computational memory.
Recommended Method:
lru_file_repage = 0
strict_maxperm = 0
strict_maxclient = 1
maxperm% = maxclient% = High Percentage
minperm% = Low Percentage
lru_poll_interval=10
Classic Method*:
lru_file_repage = 1
strict_maxperm = 0
strict_maxclient = 0
maxperm% = maxclient% = 20% (or small number)
minperm% = 5
lru_poll_interval=10
* This method is appropriate for systems that don’t have the ‘lru_file_repage’ tunable.
Calculated Method:
lru_file_repage = 0
strict_maxperm = 0
strict_maxclient = 1
maxperm% = maxclient% = 1 - % Computational + 20%
lru_poll_interval=10
Where,
%Computational = max. AVM / Real Memory Frames
Avoid:
strict_maxperm = 1 and strict_maxclient = 0
strict_maxperm = strict_maxclient = 0 & lru_file_repage = 0
Virtual Memory Management (VMM) Thresholds
[Figure: physical memory (0% to 100%) over time, showing numperm%, comp% and free% against the maxperm%, minperm%, maxfree and minfree thresholds]
Start stealing pages when free memory falls below minfree
Stop stealing pages when free memory rises above maxfree
When numperm% > maxperm%, steal only file system pages
When minperm% < numperm% < maxperm%, steal file system or computational pages, depending on the repage rate
When numperm% < minperm%, steal both file system and computational pages
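The threshold rules above can be written down as a small decision sketch. This is not how lrud is implemented; the function names and return labels are hypothetical, and the code only restates the five rules on this slide.

```python
def steal_candidates(numperm_pct, minperm_pct, maxperm_pct):
    """Which page types are eligible for stealing, per the slide's rules."""
    if numperm_pct > maxperm_pct:
        return "file"                    # only file system pages
    if numperm_pct < minperm_pct:
        return "file+computational"      # both page types
    return "by-repage-rate"              # either type, decided by repage rate


def stealing_active(free_pages, minfree, maxfree, currently_stealing):
    """Start below minfree, stop above maxfree, otherwise keep the current state."""
    if free_pages < minfree:
        return True
    if free_pages > maxfree:
        return False
    return currently_stealing
```

Between minfree and maxfree the sketch keeps the previous state, which is what gives the start/stop pair its hysteresis: stealing that began below minfree continues until the free list climbs past maxfree.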
VMM Page Stealing Thresholds
The following define thresholds for the VMM page stealing process (lrud):
minfree
– Set minfree = 120 x # logical CPUs / # memory pools
– Consider increasing if the vmstat “fre” column frequently approaches zero or if “vmstat -s” shows significant “free frame waits”
maxfree
– Set maxfree = minfree + (MAX(maxpgahead, j2_maxPageReadAhead) x # logical CPUs)
Example:
For a 6-way LPAR with SMT enabled (12 logical CPUs) and 4 memory pools, maxpgahead=8 and j2_maxPageReadAhead=8:
– minfree = 360 = 120 x 6 x 2 / 4
– maxfree = 456 = 360 + (max(8,8) x 6 x 2)
vmo -p -o minfree=360 -o maxfree=456
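As a cross-check, the two starting-point formulas on this slide can be applied literally in a few lines of Python. The function name is hypothetical, and the 4 memory pools in the usage line are an assumption taken from the "/ 4" in the example; treat the result as a starting point to validate against vmstat, not a definitive setting.

```python
def free_list_starting_points(logical_cpus, memory_pools,
                              maxpgahead, j2_max_page_read_ahead):
    """Apply the slide's formulas as written.

    minfree = 120 x logical CPUs / memory pools
    maxfree = minfree + max(maxpgahead, j2_maxPageReadAhead) x logical CPUs
    """
    minfree = (120 * logical_cpus) // memory_pools
    max_read_ahead = max(maxpgahead, j2_max_page_read_ahead)
    maxfree = minfree + max_read_ahead * logical_cpus
    return minfree, maxfree


# 6-way LPAR with SMT enabled = 12 logical CPUs; 4 memory pools assumed
minfree, maxfree = free_list_starting_points(12, 4, 8, 8)
print(minfree, maxfree)  # 360 456
```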
AIX 5.3/6.1 – minfree and maxfree changes
minfree and maxfree on AIX 5.3/6.1 are now applied to each memory pool.
total free list = minfree * # of memory pools
In earlier releases of AIX (5.2 and 5.1), minfree was divided by the number of memory pools so that the total free list (determined by adding minfree for *each* memory pool) equaled the vmo/vmtune value of minfree.
AIX Level   minfree   mempools   LRUD starts when
51/52       1024      4          free_list <= 1024
53          1024      4          free_list <= (4 * 1024)
Initial Setting AIX 5.3/6.1
minfree = max( 960, lcpus * 120)
          -----------------------
               # of mempools
maxfree = minfree + (Max Read Ahead * lcpus)
          ----------------------------------
                    # of mempools
Initial Setting AIX 5.2
minfree = max( 960, lcpus * 120)
maxfree = minfree + (Max Read Ahead * lcpus)
Where,
Max Read Ahead = max( maxpgahead, j2_maxPageReadAhead)
AIX Paging Space
Allocate Paging Space:
– Configure Server/LPAR with enough physical memory to satisfy memory requirements
– With AIX demand paging, paging space does not have to be large
– Provides a safety net to prevent system crashes when memory is overcommitted
– Generally, keep on internal drives or high-performing SAN storage
Monitor paging activity:
– vmstat -s
– sar -r
– nmon
Resolve paging issues:
– Reduce file system cache size (maxperm%, maxclient%)
– Reduce Oracle SGA or PGA (9i or later) size
– Add physical memory
Do not overcommit real memory!
AIX 5.3/6.1 Multiple Page Size Support
AIX 5.3 5300-04 introduces two new page sizes:
– 64K
– 16M (large pages)
Requires p5+ hardware
Requires p5 System Release 240, Service Level 202 microcode
16MB support requires Version 5 Release 2 of the Hardware Management Console (HMC) machine code
User/Application must request preferred page size
– 64K pages appear very promising, since they do not need to be configured/reserved in advance
– Will require Oracle code changes to explicitly support (10.2.0.4)
– If the preferred size is not available, the largest available smaller size will be used
• Current Oracle versions should end up using 64 KB pages if 16 MB pages are not configured?
Large Page Support (optional)
Pinning shared memory
AIX Parameters
• vmo -p -o v_pinshm=1
• Leave maxpin% at the default of 80% unless the SGA exceeds 77% of real memory
– vmo -p -o maxpin%=[(SGA size * 100 / total mem) + 3]
Oracle Parameters
• LOCK_SGA = TRUE
Enabling Large Page Support
vmo -r -o lgpg_size=16777216 -o lgpg_regions=(SGA size / 16 MB)
Allowing Oracle to use Large Pages
chuser capabilities=CAP_BYPASS_RAC_VMM,CAP_PROPAGATE oracle
Using Monitoring Tools
svmon -G
svmon -P
Oracle metalink note# 372157.1
Determining SGA size
SGA Memory Summary for DB: test01 Instance: test01 Snaps: 1046 -1047
SGA regions Size in Bytes
------------------------------ ----------------
Database Buffers 16,928,210,944
Fixed Size 768,448
Redo Buffers 2,371,584
Variable Size 1,241,513,984
----------------
sum 18,172,864,960
lgpg_regions = 18,172,864,960 / 16,777,216 = 1084 (rounded up)
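The rounding in the last line can be reproduced with integer arithmetic. A minimal sketch, with a hypothetical function name, that sums the SGA regions from the report above and rounds up to whole 16 MB large pages:

```python
LARGE_PAGE_BYTES = 16 * 1024 * 1024  # lgpg_size = 16777216


def lgpg_regions_for(sga_bytes):
    """Number of 16 MB large pages needed to hold the SGA, rounded up."""
    return -(-sga_bytes // LARGE_PAGE_BYTES)  # ceiling division on integers


# SGA regions from the report above, in bytes
sga_regions = {
    "Database Buffers": 16_928_210_944,
    "Fixed Size": 768_448,
    "Redo Buffers": 2_371_584,
    "Variable Size": 1_241_513_984,
}
sga_total = sum(sga_regions.values())
print(sga_total, lgpg_regions_for(sga_total))  # 18172864960 1084
```

Integer ceiling division avoids floating-point rounding surprises for very large SGAs.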
Tuning and Improving System Performance
Adjust the VMM Tuning Parameters
– Key parameters listed on word document
Implement VMM related Mount Options
– DIO / CIO
– Release-behind on read and/or write
Reduce Application Memory Requirements
Memory Model
– %Computational < 70% - Large Memory Model – Goal is to adjust tuning parameters to prevent paging
• Multiple memory pools
• Page space smaller than memory
• Must tune VMM key parameters
– %Computational > 70% - Small Memory Model – Goal is to make paging as efficient as possible
• Add multiple page spaces on different spindles
• Make all page spaces the same size to ensure round-robin scheduling
• PS = 1.5 x computational requirements
• Turn off DEFPS
• Memory Load Control
Add additional Memory
The AIX IO stack
[Figure: the layers of the I/O stack, top to bottom, with notes on where caching and queueing occur]
– Application: the application memory area caches data to avoid IO
– Logical File System
– Local FS (JFS/JFS2) or Remote FS (NFS), with Raw LVs and Raw disks as alternatives to a file system
• NFS caches file attributes; NFS has a cached filesystem for NFS clients
• JFS and JFS2 cache use extra system RAM; JFS uses persistent pages for cache, JFS2 uses client pages for cache
– VMM
– LVM
– Device Driver(s): queues exist for both adapters and disks; adapter device drivers use DMA for IO
– Disk Subsystem (optional): disk subsystems have read and write cache (Write Cache - ack sent back to application)
– Disk: disks have memory to store commands/data
Asynchronous I/O
AIX parameters (smit aio)
– minservers = 10 * # cpus
– maxservers = (10 * # disks) / # cpus
– maxreqs = a multiple of 4096 > 4 * # disks * queue_depth
– “enable” at system restart
– Typical settings: minservers=100, maxservers=200, maxreqs=16384
Oracle parameters (init.ora)
– disk_asynch_io = TRUE
– filesystemio_options = {ASYNCH | SETALL}
– db_writer_processes = n (normally left at default, 1)
– db_writer_io_slaves = n (don’t use – implements AIO simulation)
Monitor usage:
• Watch for Oracle alert log or trace file messages:
– Warning “lio_listio returned EAGAIN”
• AIX Monitoring
– “pstat -a | grep aios”
– Use “-A” and “-t” options for NMON
Note: Asynchronous I/O can be serviced in two ways. The AIO servers method uses process-based I/O, whereas the FASTPATH method is kernel-based (interrupt-based) and performs much better. Make sure FASTPATH is enabled:
– Run “lsattr -El aio0” and check that the “fastpath” attribute is set to enable
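The AIX AIO sizing rules on this slide reduce to simple arithmetic. The sketch below applies them as written; the function name and the sample CPU, disk and queue depth figures are hypothetical, and the results are starting points only.

```python
def aio_starting_points(cpus, disks, queue_depth):
    """Apply the slide's AIO sizing rules as written.

    minservers = 10 x cpus
    maxservers = (10 x disks) / cpus
    maxreqs    = smallest multiple of 4096 greater than 4 x disks x queue_depth
    """
    minservers = 10 * cpus
    maxservers = (10 * disks) // cpus
    floor = 4 * disks * queue_depth
    maxreqs = (floor // 4096 + 1) * 4096  # strictly greater than floor
    return {"minservers": minservers,
            "maxservers": maxservers,
            "maxreqs": maxreqs}


# Hypothetical system: 4 CPUs, 80 disks, queue_depth of 20
print(aio_starting_points(cpus=4, disks=80, queue_depth=20))
```

For systems with few disks per CPU the two server formulas can yield a minservers value above maxservers; in that case the values need to be reconciled by hand before they are applied.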
AIX Filesystems
Journaled File System (JFS)
– Better for lots of small file creates & deletes
– Buffer caching (default) provides Sequential Read-Ahead, cached writes, etc.
– Direct I/O (DIO) mount/open option: no caching on reads
Enhanced JFS (JFS2)
– Better for large files/filesystems
– Buffer caching (default) provides Sequential Read-Ahead, cached writes, etc.
– Direct I/O (DIO) mount/open option: no caching on reads
– Concurrent I/O (CIO) mount/open option: DIO, with write serialization disabled
• Use for Oracle .dbf, control files and online redo logs only!!!
GPFS
– Clustered filesystem – the IBM filesystem for RAC
– Non-cached, non-blocking I/Os (similar to JFS2 CIO) for all Oracle files
GPFS and JFS2 with CIO offer performance similar to raw devices
Cached vs. non-Cached (Direct) I/O
File System caching tends to benefit heavily sequential workloads with low write content. To enable caching for JFS/JFS2:
– Use default filesystem mount options
– Set Oracle filesystemio_options=ASYNCH
DIO tends to benefit heavily random access workloads and CIO tends to benefit heavy update workloads. To disable JFS, JFS2 caching, see the following table:
        Oracle 9i                           Oracle 10g
JFS     Set filesystemio_options=SETALL     Set filesystemio_options=SETALL
        -or- use “dio” mount option         -or- use “dio” mount option
JFS2    Use “cio” mount option              Set filesystemio_options=SETALL
                                            -or- use “cio” mount option
CIO Demotion and Filesystem Block Size
Data Base Files (DBF)
If db_block_size = 2048 set agblksize=2048
If db_block_size >= 4096 set agblksize=4096
Redo Log Files
Set agblksize=512 and use CIO or DIO
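The block size rules above amount to a small lookup. A sketch with a hypothetical function name and file-kind labels; it covers only the cases listed on this slide and rejects anything else rather than guessing.

```python
def jfs2_agblksize(file_kind, db_block_size=None):
    """Suggested agblksize for a filesystem, following the slide's rules.

    file_kind is "data" (database files) or "redo" (redo log files);
    these labels are illustrative, not AIX or Oracle terms.
    """
    if file_kind == "redo":
        return 512
    if file_kind == "data":
        if db_block_size == 2048:
            return 2048
        if db_block_size is not None and db_block_size >= 4096:
            return 4096
        raise ValueError("db_block_size not covered by the slide's rules")
    raise ValueError("unknown file kind: %r" % (file_kind,))
```

For example, `jfs2_agblksize("data", 8192)` returns 4096 and `jfs2_agblksize("redo")` returns 512.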
I/O Tuning (ioo)
READ-AHEAD (only applicable to JFS/JFS2 with caching enabled)
MINPGAHEAD (JFS) or j2_minPageReadAhead (JFS2)
– Default: 2
– Starting value: MAX(2, DB_BLOCK_SIZE / 4096)
MAXPGAHEAD (JFS) or j2_maxPageReadAhead (JFS2)
– Default: 8 (JFS), 128 (JFS2)
– Set equal to (or a multiple of) the size of the largest Oracle I/O request
• DB_BLOCK_SIZE * DB_FILE_MULTIBLOCK_READ_COUNT
Number of buffer structures per filesystem:
NUMFSBUFS:
– Default: 196, Starting Value: 568
j2_nBufferPerPagerDevice (superseded by j2_dynamicBufferPreallocation)
– Default: 512, Starting Value: 2048
Monitor with “vmstat -v”
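The read-ahead starting values can be derived from the two Oracle parameters. A minimal sketch under stated assumptions: the function name is hypothetical, and it assumes the read-ahead tunables are expressed in 4 KB pages, so the largest Oracle I/O request is converted from bytes to pages.

```python
PAGE_BYTES = 4096  # read-ahead tunables are assumed to be in 4 KB pages


def read_ahead_starting_values(db_block_size, db_file_multiblock_read_count):
    """Starting values for min/max page read-ahead, per the slide's rules."""
    min_pages = max(2, db_block_size // PAGE_BYTES)
    largest_io_bytes = db_block_size * db_file_multiblock_read_count
    max_pages = -(-largest_io_bytes // PAGE_BYTES)  # round up to whole pages
    return min_pages, max_pages


# 8 KB blocks with a multiblock read count of 16: 128 KB requests
print(read_ahead_starting_values(8192, 16))  # (2, 32)
```

The slide allows a multiple of the largest request size as well, so the second value is a lower bound for experimentation rather than a fixed answer.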
Data Layout for Optimal I/O Performance
Stripe and mirror everything (SAME) approach:
Goal is to balance I/O activity across all disks, loops, adapters, etc...
Avoid/Eliminate I/O hotspots
Manual file-by-file data placement is time consuming, resource intensive and iterative
Use RAID-5 or RAID-10 to create striped LUNs (hdisks)
Create AIX Volume Group(s) (VG) w/ LUNs from multiple arrays, striping on the front end as well for maximum distribution
Physical Partition Spreading (mklv -e x) -or-
Large Grained LVM striping (>= 1MB stripe size)
http://www-1.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP100319
Data Layout cont’d…
Stripe using Logical Volume (LV) or Physical Partition (PP) striping
LV Striping
– Oracle recommends a stripe width that is a multiple of
• db_block_size * db_file_multiblock_read_count
• Usually around 1 MB
– Valid LV stripe sizes:
• AIX 5.2: 4k, 8k, 16k, 32k, 64k, 128k, 256k, 512k, 1 MB
• AIX 5.3: AIX 5.2 stripe sizes + 2M, 4M, 16M, 32M, 64M, 128M
– Use AIX Logical Volume 0 offset (9i Release 2 or later)
• Use Scalable Volume Groups (VGs), or use “mklv -T O” with Big VGs
• Requires AIX APAR IY36656 and Oracle patch (bug 2620053)
PP Striping
– Use minimum Physical Partition (PP) size (mklv -t, -s parms)
• Spread AIX Logical Volume (LV) PPs across multiple hdisks in VG (mklv -e x)
Tuning and Improving System Performance
Adjust the key IOO Tuning Parameters
Adjust device specific tuning Parameters
Other I/O tuning Options
– DIO / CIO
– Release-behind on read and/or write
– IO Pacing
– Write Behind
Improve the data layout
Add additional hardware resources
Network Options (no) Parameters
• Set sb_max >= 1 MB (1048576)
• Set tcp_sendspace >= 262144
• Set tcp_recvspace >= 262144
• Set rfc1323=1
Additional Network (no) Parameters for RAC:
Set udp_sendspace = db_block_size * db_file_multiblock_read_count
(not less than 65536)
Set udp_recvspace = 4 * udp_sendspace
– Must be < sb_max
Increase if buffer overflows occur
Examples:
no -a | grep udp_sendspace
no -p -o udp_sendspace=65536
netstat -s | grep "socket buffer overflows"
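The UDP buffer rules for RAC can be checked with a short calculation. This is a sketch only: the function name is hypothetical, and the default sb_max of 1048576 is taken from the 1 MB starting point given on the preceding network parameters slide.

```python
def rac_udp_buffers(db_block_size, db_file_multiblock_read_count,
                    sb_max=1048576):
    """Apply the slide's UDP sizing rules for RAC.

    udp_sendspace = db_block_size x db_file_multiblock_read_count,
                    but not less than 65536
    udp_recvspace = 4 x udp_sendspace, which must stay below sb_max
    """
    udp_sendspace = max(65536, db_block_size * db_file_multiblock_read_count)
    udp_recvspace = 4 * udp_sendspace
    fits_sb_max = udp_recvspace < sb_max
    return udp_sendspace, udp_recvspace, fits_sb_max


# 8 KB blocks with a multiblock read count of 16
print(rac_udp_buffers(8192, 16))  # (131072, 524288, True)
```

When the third value is False, sb_max has to be raised before the computed udp_recvspace can be used.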
Miscellaneous parameters
User Limits (smit chuser)
– Soft FILE size = -1 (Unlimited)
– Soft CPU time = -1 (Unlimited)
– Soft DATA segment = -1 (Unlimited)
– Soft STACK size = -1 (Unlimited)
– /etc/security/limits
Maximum number of PROCESSES allowed per user (smit chgsys)
– maxuproc >= 2048
Environment variables:
– AIXTHREAD_SCOPE=S
DLPAR & Oracle
CPU
Oracle 9i
– Oracle CPU_COUNT does not recognize a change in the number of logical CPUs
– AIX scheduler can still use the added CPUs
Oracle 10g
– Oracle CPU_COUNT is dynamically updated for a change in the number of logical CPUs
Memory
Oracle 9i or 10g
– SGA can be dynamically resized, but has an upper bound of SGA_MAX_SIZE
• SGA_TARGET (10g)
• DB_CACHE_SIZE, SHARED_POOL_SIZE, etc.
– PGA_AGGREGATE_TARGET can be dynamically resized
SGA_TARGET and PGA_AGGREGATE_TARGET are not hard limits
Micro-Partitioning technology
Partitioning options
– Micro-partitions: Up to 254*
– Dynamic LPARs: Up to 32*
– Combination of both
Configured via the HMC
Number of logical processors
– Minimum/maximum
Entitled capacity
– In units of 1/100 of a CPU
– Minimum 1/10 of a CPU
Variable weight
– % share (priority) of surplus capacity
Capped or uncapped partitions
[Figure: a pool of 6 CPUs shared by micro-partitions (Linux, i5/OS V5R3**, AIX 5L V5.3) with minimum, entitled and maximum capacity, alongside dynamic LPARs on whole processors (AIX 5L V5.2, AIX 5L V5.3), all running on the Hypervisor]
* on p5-590 and p5-595
** on p5-570, p5-590, and p5-595
Micro-Partitioning technology allows each processor to be subdivided into as many as 10 “virtual servers”
Note: Micro-partitions are optional.
Shared Processor Logical Partitions – Terminology
[Figure: a shared processor pool with a capacity of 6 processing units hosting AIX 5.3 LPARs with and without SMT, showing POWER5 chips, processor cores, virtual processors and logical processors. The four POWER5 chips are packaged on a Multi-Chip Module (MCM).]
Shared Processor Logical Partition (splpar) key terms that will be discussed:
Physical Processors (PP) – An 8-way p5 590. For this configuration one MCM houses 4 POWER5 chips and each POWER5 chip has two processor cores. With SMT enabled, each processor core can simultaneously execute two instruction threads.
Shared Processor Pool – 6 processors have been allocated to the shared processor pool and 2 processors have been allocated to a dedicated partition.
Virtual Processors (VP) – The operating system views each virtual processor as a “physical processor”.
Logical Processors – With SMT enabled, each VP is viewed by the operating system as having two logical processors.
Processor capacity specification for splpars – Each splpar has an entitled processing capacity, which is defined via a number of partition configuration parameters.
Now, let’s discuss processor capacity specification in more detail.
Capped Shared Processor LPAR
[Figure: LPAR processor capacity utilization over time for a capped partition, plotted against the minimum, entitled and maximum processor capacity; shows utilized capacity, ceded capacity and pool idle capacity available]
Uncapped Shared Processor LPAR
[Figure: processor capacity utilization over time for an uncapped partition, plotted against the minimum, entitled and maximum processor capacity; shows utilized capacity, ceded capacity and pool idle capacity available]
Simultaneous Multithreading (SMT) & Oracle
[Figures: “CPU Total” charts (User%, Sys%, Wait%, 0 to 100) and “Processes” charts (RunQueue, Swap-in) for two runs. Without SMT: AIX52, 3/9/2004, 12:34 to 13:44. With SMT: AIX53, 10/9/2004, 10:45 to 11:49.]
Reference Material:
Oracle Technical Documentation
http://technet.oracle.com
Oracle Support
http://metalink.oracle.com (requires support license)
Check metalink note ID 282036.1
IBM Redbooks on Oracle
http://www.redbooks.ibm.com
Advanced Technical Support (Techdocs)
http://www.ibm.com/support/techdocs
http://w3.ibm.com/support/techdocs (IBM Internal)
GPFS Documentation
http://publib.boulder.ibm.com/clresctr/library/gpfs_faqs.html
AIX Documentation
http://www.ibm.com/servers/eserver/pseries/library/
Q&A
The following are trademarks of the International Business Machines Corporation in the United States and/or other countries. For a complete list of IBM Trademarks, see www.ibm.com/legal/copytrade.shtml: AS/400, DBE, e-business logo, ESCO, eServer, FICON, IBM, IBM Logo, iSeries, MVS, OS/390, pSeries, RS/6000, S/30, VM/ESA, VSE/ESA, Websphere, xSeries, z/OS, zSeries, z/VM
The following are trademarks or registered trademarks of other companies
Lotus, Notes, and Domino are trademarks or registered trademarks of Lotus Development Corporation.
Java and all Java-related trademarks and logos are trademarks of Sun Microsystems, Inc., in the United States and other countries.
LINUX is a registered trademark of Linus Torvalds.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Microsoft, Windows and Windows NT are registered trademarks of Microsoft Corporation.
SET and Secure Electronic Transaction are trademarks owned by SET Secure Electronic Transaction LLC.
Intel is a registered trademark of Intel Corporation.
* All other products may be trademarks or registered trademarks of their respective companies.
NOTES:
Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here.
IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.
All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions.
This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area.
All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography.
References in this document to IBM products or services do not imply that IBM intends to make them available in every country.
Any proposed use of claims in this presentation outside of the United States must be reviewed by local IBM country counsel prior to such use.
The information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.
Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.
Trademarks