Tom Hamilton – America’s Channel Database CSE
Performance Tuning and Databases
Agenda
Tuning Methodology
Performance Analysis
– What causes poor performance
– How to identify the causes
– How to fix the causes
Performance Analysis Tools
Oracle Best Practices
SQL Server Best Practices
NetApp Controller Best Practices
Examples
Protocol Comparisons
Introduction
Performance tuning is not a science!
It is an art form!!!!
It is about finding the bottleneck(s)!
Tuning Methodology
Check for Known Hardware and Software Problems
Consider the Whole System
Measure and Reconfigure by Levels
Change One Thing at a Time
Put Tracking and Fallback Procedures in Place Before You Start
Do Not Tune Just for the Sake of Tuning
Remember the Law of Diminishing Returns
HP-UX NFS Performance with Oracle Database 10g Using NetApp Storage – TR3557
Top 5 Performance Bottlenecks
Disk bottleneck
FC-AL Loop bottleneck
CPU/domain bottleneck
Networking and target card limits
Client side thread parallelism
Top 5 Performance Bottlenecks
– Disk Bottleneck
– Shelf Loops
– Storage CPU/Domain
– Network (FC/Ethernet)
– Server Threads
– Drive Type and IOPS
Diagnosing Disk Throughput Bottlenecks
Bottleneck: Disks have data throughput and IOPS limitations
Symptoms: High latency or inability to add more load on a particular volume
Diagnosis: Use statit to monitor disk utilization
Disk bottleneck: disk utilization is >70% *and* data transfer rates or transfers per disk are high

disk                  ut%  xfers   ureads  chain  usecs  writes
/vol0/plex0/rg0:
0b.17                  99  297.22  297.22   1.00  20034    0.00
0b.18                  99  292.55  292.55   1.00  19960    0.00
0b.19                  99  294.75  294.75   1.00  20180    0.00
0b.20                  99  294.15  294.15   1.00  19792    0.00
0b.21                  99  294.76  294.76   1.00  19632    0.00
0b.22                  99  293.70  293.70   1.00  20341    0.00

Approximate IOPS per drive:
– SAS/FC disks: 20-260
– SATA: 20-80
Diagnosing Disk Throughput Bottlenecks
Solutions:
– Use flexible volumes with large aggregates
– Add more drives to the aggregate
– Redistribute load onto lightly loaded disks
– Flash Cache
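The disk-bottleneck rule above can be sketched as a few lines of Python. The sample rows and the approximate per-drive IOPS ceilings restate the slide's numbers; the function and its headroom factor are illustrative, not a NetApp tool.

```python
# Flag any disk that statit shows as both busy (>70% utilization) and
# running near its drive type's approximate IOPS ceiling.

DRIVE_IOPS = {"sas_fc": 260, "sata": 80}  # approximate per-drive IOPS ceilings

def disk_bound(util_pct, xfers_per_sec, drive_type="sas_fc", headroom=0.8):
    """True when a disk is both busy (>70%) and near its IOPS limit."""
    return util_pct > 70 and xfers_per_sec >= headroom * DRIVE_IOPS[drive_type]

# (disk, ut%, xfers/s) taken from the statit sample above
rows = [("0b.17", 99, 297.22), ("0b.18", 99, 292.55), ("0b.19", 99, 294.75)]
hot_disks = [name for name, util, xfers in rows if disk_bound(util, xfers)]
```

All three sample disks trip both conditions, which is the signature that adding spindles (rather than CPU or network changes) is the fix.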
Diagnosing Disk Loop Saturation
Bottleneck: Each disk loop has throughput limitations: 2 Gbit ≈ 180 MB/s, 4 Gbit ≈ 360 MB/s

Symptoms:
– High response times to requests on one or more volumes
– Inability to add more load to one or more volumes
– RAID reconstruction / scrub activity
Diagnosing Disk Loop Saturation
Diagnosis: High disk utilization can indicate loop limits
Use statit to monitor disk utilization
Loop saturation: disk utilization is >70% but per-disk throughput is low (67 xfers/s × 31.5 chain × 4 KB ≈ 8 MB/s per disk) *and* data transfer rates on the loop are high

disk                  ut%  xfers  ureads  chain  usecs  writes  chain  usecs
/vol0/plex0/rg0:
0b.16                   0   0.03    0.02  10.00   5250    0.01  10.00      0
0b.17                  97  66.91   66.90  31.54   1774    0.01  10.00      0
0b.18                  96  67.06   67.05  31.49   1552    0.01  10.00      0
0b.19                  95  66.99   66.98  31.54   1472    0.01  10.00      0
0b.20                  94  67.05   67.04  31.48   1453    0.01  10.00      0
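The loop-saturation arithmetic above can be checked directly: per-disk throughput is xfers/s × average chain length × 4 KB blocks. A disk that is more than 70% busy yet moving only ~8 MB/s points at the loop rather than the drive. The helper names below are illustrative.

```python
# Per-disk data rate implied by a statit line, in MB/s.

BLOCK_KB = 4  # transfers are counted in 4 KB blocks

def disk_mb_per_sec(xfers_per_sec, avg_chain):
    return xfers_per_sec * avg_chain * BLOCK_KB / 1024.0

# Disk 0b.17 from the sample: 66.91 xfers/s with 31.54-block chains
tput = disk_mb_per_sec(66.91, 31.54)        # ~8.2 MB/s per disk
loop_suspect = (97 > 70) and (tput < 20)    # high ut%, low per-disk throughput
```

Contrast this with the disk-bound case on the previous slide, where chains were 1.00 and each disk moved barely 1 MB/s at ~297 IOPS: there the per-disk IOPS ceiling, not the loop, was the limit.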
Diagnosing Disk Loop Saturation
Solutions:
– Add more disk loops
– Add dual-path on a single controller, or multipath HA on a CFO pair (requires Data ONTAP 7.1.1 or later)
– Redistribute disks in a volume/RAID group across loops
Diagnosing CPU/Domain Bottleneck
Bottleneck: CPU(s) have more work than they can handle
Symptoms:
– High latency or client-side sluggishness
– sysstat reports CPU utilization > 95%

Diagnosis: Use statit to check CPU statistics
Look for CPU utilization > 95% on 1P, > 190% on 2P, > 350% on 4P

CPU Statistics
109.921938 time (seconds)        100%
216.140218 system time           197%
  2.582292 rupt time               2%  (445061 rupts x 6 usec/rupt)
213.557920 non-rupt system time  194%
  3.703658 idle time               3%
Diagnosing CPU/Domain Bottleneck
Diagnosis: If CPU utilization is not high enough to explain the symptoms, a "domain" bottleneck may be the cause
– Controller processing is split into domains such as RAID, networking, storage, etc.
– Work associated with a given domain can run on only one processor at any given time
– A poorly balanced configuration can concentrate all outstanding work in a single domain
– Not a common problem
Diagnosing CPU/Domain Bottleneck
Diagnosis: Use sysstat -m or statit to check domain utilization
Look for 'total' domain time > 900,000 usec (90%)

            cpu0       cpu1       total
idle        16952.55   16740.97    33693.52
kahuna      456550.84  465228.22  921779.17
network     293882.11  282281.51  576163.53
storage     30605.83   30353.09    60958.92
exempt      121252.02  121671.95  242923.97
raid        70162.66   70080.22   140242.89
target      214.24     204.22       418.46

The kahuna total of 921,779 usec is ~92% utilization.
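The 90% domain rule above reduces to summing the per-CPU times for each domain and flagging any total over 900,000 usec. The dict literal below simply restates the slide's sample; a one-second (1,000,000 usec) sample interval is assumed.

```python
# Sum per-CPU domain times and flag saturated domains (>90% of the sample).

DOMAIN_USEC = {
    "idle":    (16952.55, 16740.97),
    "kahuna":  (456550.84, 465228.22),
    "network": (293882.11, 282281.51),
    "storage": (30605.83, 30353.09),
    "exempt":  (121252.02, 121671.95),
    "raid":    (70162.66, 70080.22),
    "target":  (214.24, 204.22),
}

THRESHOLD_USEC = 900_000  # 90% of a 1,000,000 usec sample

saturated = {name: round(sum(cpus)) for name, cpus in DOMAIN_USEC.items()
             if sum(cpus) > THRESHOLD_USEC}
```

Only kahuna crosses the line here, even though neither CPU alone looks maxed out, which is exactly why a domain bottleneck is easy to miss when looking at aggregate CPU utilization.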
Diagnosing CPU Bottleneck
Solutions:
– Use FlexShare to prioritize workloads
– Stagger workload to non-peak times
– Reschedule transient activities such as RAID reconstructs and scrubs
– Load balance by migrating work to other filers
– If network traffic is high, look for misbehaving clients, bad mounts, and virus scanning activity
– Upgrade to a higher performing filer
– Flash Cache?
Diagnosing Network and Target Adapter Limits
Gigabit Ethernet: 80 MB/s
10 Gb Ethernet: 800 MB/s
FC 2 Gb target: 180 MB/s
FC 4 Gb target: 360 MB/s
Bottleneck: Network and Target adapters have throughput and IOPS limitations per port
Symptoms: Poor responsiveness to clients on a given network and inability to add more load to the filer
Diagnosis: Use statit or ifstat to monitor traffic on each port
Look for the port limits; for example, on Gigabit Ethernet:

Network Interface Statistics (per second)
iface  side  bytes         packets   multicasts  errors  collisions
e0     recv  595.95        7.65      1.67        0.00    0.00
e0     xmit  679.24        6.46      0.00        0.00    0.00
e9a    recv  473754.48     3536.66   2.24        0.00    0.00
e9a    xmit  121823577.22  40645.67  0.00        0.00    0.00
e9b    recv  471987.25     3523.48   2.24        0.00    0.00
e9b    xmit  60596317.35   40493.83  0.00        0.00    0.00
e11a   recv  477358.49     3563.57   2.24        0.00    0.00
e11a   xmit  61286194.79   40954.82  0.00        0.00    0.00
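Comparing ifstat rates against the per-port ceilings above is a one-line conversion; a sketch follows. The MB/s limits restate the slide's table, and treating e9a as a Gigabit Ethernet port is an assumption for illustration.

```python
# Convert an interface's bytes/s to MB/s and compare against the port's
# practical throughput ceiling.

PORT_LIMIT_MB = {"1GbE": 80, "10GbE": 800, "FC2G": 180, "FC4G": 360}

def port_saturated(bytes_per_sec, port_type, margin=0.9):
    """True once traffic reaches 90% of the port's practical limit."""
    mb_per_sec = bytes_per_sec / (1024 * 1024)
    return mb_per_sec >= margin * PORT_LIMIT_MB[port_type]

# e9a transmits ~116 MB/s -- well past a GbE port's ~80 MB/s ceiling
e9a_xmit = 121_823_577.22
```

On a 10 GbE port the same rate would be unremarkable, which is why the port type matters as much as the raw byte count.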
Diagnosing Network/Target Adapter Bottleneck
Solutions:
– Use link aggregation (trunking) to increase network bandwidth
– Add more adapters or a multi-ported adapter
– Route traffic through underutilized interfaces
– Upgrade to a 10 Gb network interface or a 4 Gb FC target interface
Diagnosing Client Side Thread Parallelism
Problem: Applications see poor throughput, but the controller does not show any bottleneck

Symptoms:
– Storage system not fully utilized
– High I/O wait times at the client
Diagnosing Client Side Thread Parallelism
Diagnosis: Use sysstat to check CPU, disk, and network utilization

CPU   Total  Disk read  Disk write  Cache hit  CP time  CP ty  Disk util  FCP   FCP in  FCP out
48%    1141      24871       36846       100%      61%    23f        30%  1141   50221    19861
44%    1112      24937       34109       100%      63%    22f        28%  1112   44818    20359
51%    1192      23924       42640       100%      67%    26f        31%  1192   53965    25354
29%     761      16554       21744       100%      39%     14        20%   761   27403    12942
11%     415       6134        4650        99%      11%      5         7%   415    7077     4890

(All-zero NFS, CIFS, HTTP, tape, DAFS, and iSCSI columns and the near-idle network columns are omitted; disk and FCP figures are kB/s.)
Typically a single thread of execution from the client (e.g., cp and dd)

Solutions:
– Application tuning: more threads, larger transfer size
– Tune client or target throttles in the case of FCP
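The "more threads" fix can be sketched on the client side: instead of one cp/dd-style sequential stream, read the file in fixed-size chunks from a thread pool so several I/O requests are outstanding at once. The function names, chunk size, and thread count are illustrative, not a NetApp utility.

```python
# Read a file with several concurrent readers, one chunk per task.
import os
from concurrent.futures import ThreadPoolExecutor

def read_chunk(path, offset, size):
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)

def parallel_read(path, chunk_size=1 << 20, threads=8):
    """Read `path` with `threads` concurrent readers; returns the full contents."""
    offsets = range(0, os.path.getsize(path), chunk_size)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        chunks = pool.map(lambda off: read_chunk(path, off, chunk_size), offsets)
    return b"".join(chunks)  # map() yields results in offset order
```

On an NFS mount each thread keeps its own request in flight, which is what drives the storage system toward the utilization the sysstat output above shows it never reaches with a single stream.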
What is not a Bottleneck?
NVRAM
– Provides data integrity in case of controller failure
– Only logs transactions before they are committed to disk
– NVRAM is not a write cache
Key Takeaways
Performance Monitoring
– Adhere to a performance monitoring methodology
– Focus on latency monitoring
– Recognize performance trends

Performance Analysis
– Identify common performance bottlenecks: disk, FC-AL loop, CPU/domain, networking and target card limits, client-side thread parallelism
– Understand possible solutions
Performance Tuning Analysis Tools
– sysstat
– statit (na_statit.1a.htm, statit_explained.pdf)
– Perfstat / Latx
– SIO
– stats
– https://now.netapp.com/eservice/toolchest
– Quest Spotlight
– OnCommand InSight Balance
– DBTuna
Sizing Tools
Database Sizer
– Statspack / AWR
– SQL Performance Analyzer
– Perfstat

Unified Sizer
NetApp Controller Best Practices
Aggregates – TR-3437
– Disks (number, size, type): 15K disks
– Disk loops: IMT
– Network adapters
– FC adapters

WAFL – WAFL Overview
Data ONTAP – version
Before PAM-II
Disk I/O for a 30-Disk SATA Aggregate
After PAM-II Installation – TR3832
Disk I/O for a 30-Disk SATA Aggregate
– 20 minutes after install: cache 90% populated
– In the first 30 minutes of operation, PAM-II delivers I/O equivalent to 1 shelf of 15K FC or 5 shelves of SATA
Deciding Between Flash Cache & SSD

Flash Cache (intelligent read caching) is a good fit when:
– The workload is random read intensive
– Improving average response time is adequate
– Active data is unpredictable or unknown
– An administration-free approach is desired
– Minimizing system price is important
– Accelerating an existing HDD configuration is desired

SSDs in the DS4243 (persistent storage) are a good fit when:
– The workload is random read intensive
– Every read must be fast
– Active data is known and fits into SSD capacity
– Active data is known, is dynamic, and ongoing administration is okay
– The upside of write acceleration is desired
– Performance must be consistent across failover events

NetApp Confidential – Limited Use
© 2010 NetApp. All rights reserved.
Product Information – TR3938

DS4243 SSD Option
– 24 x 100 GB SSD per shelf
– Requires Data ONTAP 8.0.1 or later; both 7-Mode and Cluster-Mode are supported
– Shelf setup is exactly the same as with SATA or SAS

Available in full-shelf increments only (24 drives):
– Add-on: DS4243-SL02-24A-QS-R5
– Configured: DS4243-SL02-24A-R5-C
– Individual drives may be ordered as parts (X441A-R5, formerly X442A-R5)

Platforms:
– Supported: FAS/V3160, 3170, 6000 series, 3240, 3270, and 6200 series
– Not supported: FAS/V2020, 2040, 2050, 3140, 3210, and all 3000 series
Performance - Sequential
Sequential I/O throughput per drive
[Chart: MB/sec per drive for large sequential reads and writes, 15K rpm FC drive vs. SSD; y-axis 0-120 MB/sec]
Storage Performance Guidelines – TR-3437

Adequate spindle counts
– Aggregate
– Traditional volumes
– SATA considerations

SnapMirror / SnapVault considerations
– Stagger schedules
– Schedule workloads at off-peak times
– Throttle bandwidth

Multipath HA
LUN alignment
HP-UX NFS Performance with Oracle Database 10g Using NetApp Storage – TR3557

– HP-UX Server Tuning: http://docs.hp.com/en/5992-4222ENW/5992-4222ENW.pdf
– Tune-N-Tools: http://software.hp.com/portal/swdepot/displayProductInfo.do?productNumber=Tune-N-Tools
– HP-UX TCP/IP Performance White Paper: http://docs.hp.com/en/11890/perf-whitepaper-tcpip-v1_1.pdf

– HP-UX kernel parameters
– Oracle initialization parameters
– NFS mount options: http://now.netapp.com/Knowledgebase/solutionarea.asp?id=kb7518
– HP-UX NFS kernel parameters
– HP-UX patch bundles
AIX DB2 Performance Protocol Comparison on NetApp, TR-3581
AIX Oracle Performance Protocol Comparison on NetApp, TR-3871
Oracle 10g Performance Protocol Comparison on SUN Solaris 10 – TR3496
Oracle 11g R2 Performance Protocol Comparison on RHEL 5 – TR3932
Summary
– NetApp Controller
– Host
– Network
– Database Settings

FIND THE BOTTLENECK!!