Tom Hamilton – America’s Channel Database CSE
Performance Tuning and Databases
Agenda
Tuning Methodology
Performance Analysis
– What causes poor performance
– How to identify the causes
– How to fix the causes
Performance Analysis Tools
Oracle Best Practices
SQL Server Best Practices
NetApp Controller Best Practices
Examples
Protocol Comparisons
Introduction
Performance tuning is not a science!
It is an art form!!!!
It is about finding the bottleneck(s)!
Tuning Methodology
Check for Known Hardware and Software Problems
Consider the Whole System
Measure and Reconfigure by Levels
Change One Thing at a Time
Put Tracking and Fallback Procedures in Place Before You Start
Do Not Tune Just for the Sake of Tuning
Remember the Law of Diminishing Returns
HP-UX NFS Performance with Oracle Database 10g Using NetApp Storage – TR3557
Top 5 Performance Bottlenecks
Disk bottleneck
FC-AL Loop bottleneck
CPU/domain bottleneck
Networking and target card limits
Client side thread parallelism
Top 5 Performance Bottlenecks
– Disk Bottleneck
– Shelf Loops
– Storage CPU/Domain
– Network (FC/Ethernet)
– Server Threads
– Drive Type and IOPS
Diagnosing Disk Throughput Bottlenecks
Bottleneck: Disks have data throughput and IOPS limitations
Symptoms: High latency or inability to add more load on a particular volume
Diagnosis: Use statit to monitor disk utilization
Disk bottleneck: disk utilization is >70% *and* data transfer rates or transfers per disk are high

disk                  ut%  xfers   ureads  chain  usecs  writes
/vol0/plex0/rg0:
0b.17                  99  297.22  297.22   1.00  20034    0.00
0b.18                  99  292.55  292.55   1.00  19960    0.00
0b.19                  99  294.75  294.75   1.00  20180    0.00
0b.20                  99  294.15  294.15   1.00  19792    0.00
0b.21                  99  294.76  294.76   1.00  19632    0.00
0b.22                  99  293.70  293.70   1.00  20341    0.00

Approximate IOPS per drive:
– SAS/FC disks: 20-260
– SATA: 20-80
Diagnosing Disk Throughput Bottlenecks
Solutions:
– Use flexible volumes with large aggregates
– Add more drives to the aggregate
– Redistribute load onto lightly loaded disks
– Flash Cache
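The disk-bottleneck rule above can be sketched as a few lines of Python. The sample rows and the approximate per-drive IOPS ceilings restate the slide's numbers; the function and its headroom factor are illustrative, not a NetApp tool.

```python
# Flag any disk that statit shows as both busy (>70% utilization) and
# running near its drive type's approximate IOPS ceiling.

DRIVE_IOPS = {"sas_fc": 260, "sata": 80}  # approximate per-drive IOPS ceilings

def disk_bound(util_pct, xfers_per_sec, drive_type="sas_fc", headroom=0.8):
    """True when a disk is both busy (>70%) and near its IOPS limit."""
    return util_pct > 70 and xfers_per_sec >= headroom * DRIVE_IOPS[drive_type]

# (disk, ut%, xfers/s) taken from the statit sample above
rows = [("0b.17", 99, 297.22), ("0b.18", 99, 292.55), ("0b.19", 99, 294.75)]
hot_disks = [name for name, util, xfers in rows if disk_bound(util, xfers)]
```

All three sample disks trip both conditions, which is the signature that adding spindles (rather than CPU or network changes) is the fix.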
Diagnosing Disk Loop Saturation
Bottleneck: Each disk loop has throughput limitations: 2 Gbit ≈ 180 MB/s, 4 Gbit ≈ 360 MB/s

Symptoms:
– High response times to requests on one or more volumes
– Inability to add more load to one or more volumes
– RAID reconstruction / scrub activity
Diagnosing Disk Loop Saturation
Diagnosis: High disk utilization can indicate loop limits
Use statit to monitor disk utilization
Loop saturation: disk utilization is >70% but per-disk throughput is low (67 xfers/s × 31.5 chain × 4 KB ≈ 8 MB/s per disk) *and* data transfer rates on the loop are high

disk                  ut%  xfers  ureads  chain  usecs  writes  chain  usecs
/vol0/plex0/rg0:
0b.16                   0   0.03    0.02  10.00   5250    0.01  10.00      0
0b.17                  97  66.91   66.90  31.54   1774    0.01  10.00      0
0b.18                  96  67.06   67.05  31.49   1552    0.01  10.00      0
0b.19                  95  66.99   66.98  31.54   1472    0.01  10.00      0
0b.20                  94  67.05   67.04  31.48   1453    0.01  10.00      0
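The loop-saturation arithmetic above can be checked directly: per-disk throughput is xfers/s × average chain length × 4 KB blocks. A disk that is more than 70% busy yet moving only ~8 MB/s points at the loop rather than the drive. The helper names below are illustrative.

```python
# Per-disk data rate implied by a statit line, in MB/s.

BLOCK_KB = 4  # transfers are counted in 4 KB blocks

def disk_mb_per_sec(xfers_per_sec, avg_chain):
    return xfers_per_sec * avg_chain * BLOCK_KB / 1024.0

# Disk 0b.17 from the sample: 66.91 xfers/s with 31.54-block chains
tput = disk_mb_per_sec(66.91, 31.54)        # ~8.2 MB/s per disk
loop_suspect = (97 > 70) and (tput < 20)    # high ut%, low per-disk throughput
```

Contrast this with the disk-bound case on the previous slide, where chains were 1.00 and each disk moved barely 1 MB/s at ~297 IOPS: there the per-disk IOPS ceiling, not the loop, was the limit.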
Diagnosing Disk Loop Saturation
Solutions:
– Add more disk loops
– Add dual-path on a single controller, or multipath HA on a CFO pair (requires Data ONTAP 7.1.1 or later)
– Redistribute disks in a volume/RAID group across loops
Diagnosing CPU/Domain Bottleneck
Bottleneck: CPU(s) have more work than they can handle
Symptoms:
– High latency or client-side sluggishness
– sysstat reports CPU utilization > 95%

Diagnosis: Use statit to check CPU statistics
Look for CPU utilization > 95% on 1P, > 190% on 2P, > 350% on 4P

CPU Statistics
109.921938 time (seconds)        100%
216.140218 system time           197%
  2.582292 rupt time               2%  (445061 rupts x 6 usec/rupt)
213.557920 non-rupt system time  194%
  3.703658 idle time               3%
Diagnosing CPU/Domain Bottleneck
Diagnosis: If CPU utilization is not high enough to explain the symptoms, a "domain" bottleneck may be the cause
– Controller processing is split into domains such as RAID, networking, storage, etc.
– Work associated with a given domain can run on only one processor at any given time
– A poorly balanced configuration can concentrate all outstanding work in a single domain
– Not a common problem
Diagnosing CPU/Domain Bottleneck
Diagnosis: Use sysstat -m or statit to check domain utilization
Look for 'total' domain time > 900,000 usec (90%)

            cpu0       cpu1       total
idle        16952.55   16740.97    33693.52
kahuna      456550.84  465228.22  921779.17
network     293882.11  282281.51  576163.53
storage     30605.83   30353.09    60958.92
exempt      121252.02  121671.95  242923.97
raid        70162.66   70080.22   140242.89
target      214.24     204.22       418.46

The kahuna total of 921,779 usec is ~92% utilization.
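The 90% domain rule above reduces to summing the per-CPU times for each domain and flagging any total over 900,000 usec. The dict literal below simply restates the slide's sample; a one-second (1,000,000 usec) sample interval is assumed.

```python
# Sum per-CPU domain times and flag saturated domains (>90% of the sample).

DOMAIN_USEC = {
    "idle":    (16952.55, 16740.97),
    "kahuna":  (456550.84, 465228.22),
    "network": (293882.11, 282281.51),
    "storage": (30605.83, 30353.09),
    "exempt":  (121252.02, 121671.95),
    "raid":    (70162.66, 70080.22),
    "target":  (214.24, 204.22),
}

THRESHOLD_USEC = 900_000  # 90% of a 1,000,000 usec sample

saturated = {name: round(sum(cpus)) for name, cpus in DOMAIN_USEC.items()
             if sum(cpus) > THRESHOLD_USEC}
```

Only kahuna crosses the line here, even though neither CPU alone looks maxed out, which is exactly why a domain bottleneck is easy to miss when looking at aggregate CPU utilization.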
Diagnosing CPU Bottleneck
Solutions:
– Use FlexShare to prioritize workloads
– Stagger workload to non-peak times
– Reschedule transient activities such as RAID reconstructs and scrubs
– Load balance by migrating work to other filers
– If network traffic is high, look for misbehaving clients, bad mounts, and virus scanning activity
– Upgrade to a higher performing filer
– Flash Cache?
Diagnosing Network and Target Adapter Limits
Gigabit Ethernet: 80 MB/s
10 Gb Ethernet: 800 MB/s
FC 2 Gb target: 180 MB/s
FC 4 Gb target: 360 MB/s
Bottleneck: Network and Target adapters have throughput and IOPS limitations per port
Symptoms: Poor responsiveness to clients on a given network and inability to add more load to the filer
Diagnosis: Use statit or ifstat to monitor traffic on each port
Look for the port limits; for example, on Gigabit Ethernet:

Network Interface Statistics (per second)
iface  side  bytes         packets   multicasts  errors  collisions
e0     recv  595.95        7.65      1.67        0.00    0.00
e0     xmit  679.24        6.46      0.00        0.00    0.00
e9a    recv  473754.48     3536.66   2.24        0.00    0.00
e9a    xmit  121823577.22  40645.67  0.00        0.00    0.00
e9b    recv  471987.25     3523.48   2.24        0.00    0.00
e9b    xmit  60596317.35   40493.83  0.00        0.00    0.00
e11a   recv  477358.49     3563.57   2.24        0.00    0.00
e11a   xmit  61286194.79   40954.82  0.00        0.00    0.00
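Comparing ifstat rates against the per-port ceilings above is a one-line conversion; a sketch follows. The MB/s limits restate the slide's table, and treating e9a as a Gigabit Ethernet port is an assumption for illustration.

```python
# Convert an interface's bytes/s to MB/s and compare against the port's
# practical throughput ceiling.

PORT_LIMIT_MB = {"1GbE": 80, "10GbE": 800, "FC2G": 180, "FC4G": 360}

def port_saturated(bytes_per_sec, port_type, margin=0.9):
    """True once traffic reaches 90% of the port's practical limit."""
    mb_per_sec = bytes_per_sec / (1024 * 1024)
    return mb_per_sec >= margin * PORT_LIMIT_MB[port_type]

# e9a transmits ~116 MB/s -- well past a GbE port's ~80 MB/s ceiling
e9a_xmit = 121_823_577.22
```

On a 10 GbE port the same rate would be unremarkable, which is why the port type matters as much as the raw byte count.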
Diagnosing Network/Target Adapter Bottleneck
Solutions:
– Use link aggregation (trunking) to increase network bandwidth
– Add more adapters or a multi-ported adapter
– Route traffic through underutilized interfaces
– Upgrade to a 10 Gb network interface or a 4 Gb FC target interface
Diagnosing Client Side Thread Parallelism
Problem: Applications see poor throughput, but the controller does not show any bottleneck

Symptoms:
– Storage system not fully utilized
– High I/O wait times at the client
Diagnosing Client Side Thread Parallelism
Diagnosis: Use sysstat to check CPU, disk, and network utilization

CPU   Total  Disk read  Disk write  Cache hit  CP time  CP ty  Disk util  FCP   FCP in  FCP out
48%    1141      24871       36846       100%      61%    23f        30%  1141   50221    19861
44%    1112      24937       34109       100%      63%    22f        28%  1112   44818    20359
51%    1192      23924       42640       100%      67%    26f        31%  1192   53965    25354
29%     761      16554       21744       100%      39%     14        20%   761   27403    12942
11%     415       6134        4650        99%      11%      5         7%   415    7077     4890

(All-zero NFS, CIFS, HTTP, tape, DAFS, and iSCSI columns and the near-idle network columns are omitted; disk and FCP figures are kB/s.)
Typically a single thread of execution from the client (e.g., cp and dd)

Solutions:
– Application tuning: more threads, larger transfer size
– Tune client or target throttles in the case of FCP
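The "more threads" fix can be sketched on the client side: instead of one cp/dd-style sequential stream, read the file in fixed-size chunks from a thread pool so several I/O requests are outstanding at once. The function names, chunk size, and thread count are illustrative, not a NetApp utility.

```python
# Read a file with several concurrent readers, one chunk per task.
import os
from concurrent.futures import ThreadPoolExecutor

def read_chunk(path, offset, size):
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)

def parallel_read(path, chunk_size=1 << 20, threads=8):
    """Read `path` with `threads` concurrent readers; returns the full contents."""
    offsets = range(0, os.path.getsize(path), chunk_size)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        chunks = pool.map(lambda off: read_chunk(path, off, chunk_size), offsets)
    return b"".join(chunks)  # map() yields results in offset order
```

On an NFS mount each thread keeps its own request in flight, which is what drives the storage system toward the utilization the sysstat output above shows it never reaches with a single stream.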
What is not a Bottleneck?
NVRAM
– Provides data integrity in case of controller failure
– Only logs transactions before they are committed to disk
– NVRAM is not a write cache
Key Takeaways
Performance Monitoring
– Adhere to a performance monitoring methodology
– Focus on latency monitoring
– Recognize performance trends

Performance Analysis
– Identify common performance bottlenecks: disk, FC-AL loop, CPU/domain, networking and target card limits, client-side thread parallelism
– Understand possible solutions
Performance Tuning Analysis Tools
– sysstat
– statit (na_statit.1a.htm, statit_explained.pdf)
– Perfstat / Latx
– SIO
– stats
– https://now.netapp.com/eservice/toolchest
– Quest Spotlight
– OnCommand InSight Balance
– DBTuna
Sizing Tools
Database Sizer
– Statspack / AWR
– SQL Performance Analyzer
– Perfstat

Unified Sizer
NetApp Controller Best Practices
Aggregates – TR-3437
– Disks (number, size, type): 15K disks
– Disk loops: IMT
– Network adapters
– FC adapters

WAFL – WAFL Overview
Data ONTAP – version
Before PAM-II
Disk I/O for a 30-Disk SATA Aggregate
After PAM-II Installation – TR3832
Disk I/O for a 30-Disk SATA Aggregate
– 20 minutes after install: cache 90% populated
– In the first 30 minutes of operation, PAM-II delivers I/O equivalent to 1 shelf of 15K FC or 5 shelves of SATA
Deciding Between Flash Cache & SSD

Flash Cache (intelligent read caching) is a good fit when:
– The workload is random read intensive
– Improving average response time is adequate
– Active data is unpredictable or unknown
– An administration-free approach is desired
– Minimizing system price is important
– Accelerating an existing HDD configuration is desired

SSDs in the DS4243 (persistent storage) are a good fit when:
– The workload is random read intensive
– Every read must be fast
– Active data is known and fits into SSD capacity
– Active data is known, is dynamic, and ongoing administration is okay
– The upside of write acceleration is desired
– Performance must be consistent across failover events

NetApp Confidential – Limited Use
© 2010 NetApp. All rights reserved.
Product Information – TR3938

DS4243 SSD Option
– 24 x 100 GB SSD per shelf
– Requires Data ONTAP 8.0.1 or later; both 7-Mode and Cluster-Mode are supported
– Shelf setup is exactly the same as with SATA or SAS

Available in full-shelf increments only (24 drives):
– Add-on: DS4243-SL02-24A-QS-R5
– Configured: DS4243-SL02-24A-R5-C
– Individual drives may be ordered as parts (X441A-R5, formerly X442A-R5)

Platforms:
– Supported: FAS/V3160, 3170, 6000 series, 3240, 3270, and 6200 series
– Not supported: FAS/V2020, 2040, 2050, 3140, 3210, and all 3000 series
Performance - Sequential
Sequential I/O throughput per drive
[Chart: MB/sec per drive for large sequential reads and writes, 15K rpm FC drive vs. SSD; y-axis 0-120 MB/sec]
Storage Performance Guidelines – TR-3437

Adequate spindle counts
– Aggregate
– Traditional volumes
– SATA considerations

SnapMirror / SnapVault considerations
– Stagger schedules
– Schedule workloads at off-peak times
– Throttle bandwidth

Multipath HA
LUN alignment
HP-UX NFS Performance with Oracle Database 10g Using NetApp Storage – TR3557

– HP-UX Server Tuning: http://docs.hp.com/en/5992-4222ENW/5992-4222ENW.pdf
– Tune-N-Tools: http://software.hp.com/portal/swdepot/displayProductInfo.do?productNumber=Tune-N-Tools
– HP-UX TCP/IP Performance White Paper: http://docs.hp.com/en/11890/perf-whitepaper-tcpip-v1_1.pdf

– HP-UX kernel parameters
– Oracle initialization parameters
– NFS mount options: http://now.netapp.com/Knowledgebase/solutionarea.asp?id=kb7518
– HP-UX NFS kernel parameters
– HP-UX patch bundles
AIX DB2 Performance Protocol Comparison on NetApp, TR-3581
AIX Oracle Performance Protocol Comparison on NetApp, TR-3871
Oracle 10g Performance Protocol Comparison on SUN Solaris 10 – TR3496
Oracle 11g R2 Performance Protocol Comparison on RHEL 5 – TR3932
Summary
– NetApp Controller
– Host
– Network
– Database Settings

FIND THE BOTTLENECK!!