Perfmon And Profiler 101

Post on 25-May-2015

Learn to use Performance Monitor and SQL Server Profiler.

© 2008 Quest Software, Inc. ALL RIGHTS RESERVED.

Perfmon and Profiler 101

About Me: Brent Ozar
• SQL Server Expert for Quest Software
• Former SQL DBA
• Managed >80 TB SAN, VMware
• Dot-com-crash experience
• Specializes in performance tuning

Today’s Agenda
• The Civic & Godzilla
• Metrics, Trace, Mitigations
• Taking Before and After Pictures
• Helpful Tools
• Sample Scenarios
• Resources and Q&A

If You Don’t Need to Go Fast…

Photo Licensed With Creative Commons From http://flickr.com/photos/stevekeys/2755142278/

But The Faster You Want To Go

The More You Have To Measure

Windows “Check Engine” Light

And If You Wanna Go Fast:

Two Approaches to Detection
• Exceptions Monitoring:
  – Check Engine Light
  – Reactive Actions
• Proactive Monitoring:
  – Detailed Gauges
  – Preventative Actions

Metrics-Trace-Mitigation Process

Where Do We Start?

Capture Metrics With Perfmon
• Performance Monitor, aka Perfmon
• Ships with all Windows versions
• Polls any server from your desktop
• Pulls performance metrics
• Writes them to a file
• Requires some OS permissions
• Does not include alerts or analytics

Memory Counters
• Memory – Available MBytes
• Paging File – % Usage
• SQLServer:Buffer Manager
  – Buffer cache hit ratio
  – Page life expectancy
• SQLServer:Memory Manager – Memory Grants Pending

Storage Metrics: Physical Disk
• Avg. Disk Queue Length
• Avg. Disk sec/Read
• Avg. Disk sec/Write
• Disk Reads/sec
• Disk Writes/sec
• % Disk Time

CPU & Other Metrics
• Processor – % Processor Time
• System – Processor Queue Length
• SQLServer:General Statistics – User Connections

The Raw Output: CSV Files

Adding Analytical Formulas

That’s a Lot of Zeroes!

Sorting High to Low
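
The slides do this analysis in a spreadsheet. As a rough equivalent, here is a minimal Python sketch of the same steps: load the CSV, compute min/average/max per counter, and sort one counter's samples high to low. The server name SQL01, the sample values, and the function names are assumptions made up for illustration, and a real Perfmon export may differ in header layout.

```python
import csv
import io
from statistics import mean

# Hypothetical sample shaped like a Perfmon CSV export: the first column is
# the timestamp, the others are counters. A blank sample (" ") is included
# because counters can report no value for an interval.
SAMPLE_CSV = r'''"(PDH-CSV 4.0)","\\SQL01\System\Processor Queue Length","\\SQL01\Memory\Available MBytes"
"05/25/2015 10:00:00","2","900"
"05/25/2015 10:00:15","6","700"
"05/25/2015 10:00:30"," ","800"
'''

def load_samples(csv_text):
    """Parse CSV text into {counter: [(timestamp, value), ...]}."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    samples = {name: [] for name in header[1:]}  # column 0 holds the timestamp
    for row in reader:
        for name, raw in zip(header[1:], row[1:]):
            try:
                samples[name].append((row[0], float(raw)))
            except ValueError:
                pass  # skip blank or non-numeric samples
    return samples

def summarize(samples):
    """Return {counter: (min, average, max)} for counters that have data."""
    summary = {}
    for name, points in samples.items():
        values = [value for _, value in points]
        if values:
            summary[name] = (min(values), mean(values), max(values))
    return summary

def sort_high_to_low(points):
    """Order one counter's (timestamp, value) samples by value, worst first."""
    return sorted(points, key=lambda point: point[1], reverse=True)

if __name__ == "__main__":
    data = load_samples(SAMPLE_CSV)
    for counter, stats in summarize(data).items():
        print(counter, stats)
```

Sorting a single counter high to low puts the worst samples and their timestamps at the top, which gives the time ranges to look up in the trace later.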

What To Look For, In Order
• System – Processor Queue Length
• Memory – Available MBytes
• Lock pages in memory!

What To Look For Next
• Disk metrics on the page file drive
• Disk metrics on the log file drive
• Disk metrics on the data file drive
• Disk metrics on the TempDB drive

Got Everything on One Drive?
• Narrow it down with the DMV sys.dm_io_virtual_file_stats
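
One way to narrow it down, sketched in Python under some assumptions: the rows have already been fetched from sys.dm_io_virtual_file_stats into dictionaries keyed by the DMV's column names (num_of_reads, io_stall_read_ms, num_of_writes, io_stall_write_ms). The sample numbers and the function name are hypothetical. Note that the DMV's counters are cumulative since the instance started, so a real comparison would normally diff two snapshots.

```python
# Hypothetical rows, as if copied out of sys.dm_io_virtual_file_stats.
SAMPLE_FILE_STATS = [
    {"database_id": 5, "file_id": 1, "num_of_reads": 1000,
     "io_stall_read_ms": 5000, "num_of_writes": 200, "io_stall_write_ms": 400},
    {"database_id": 2, "file_id": 1, "num_of_reads": 500,
     "io_stall_read_ms": 10000, "num_of_writes": 1000, "io_stall_write_ms": 3000},
]

def rank_files_by_stall(rows):
    """Return per-file average latencies, ordered by total stall time."""
    ranked = []
    for row in rows:
        reads = row["num_of_reads"]
        writes = row["num_of_writes"]
        ranked.append({
            "database_id": row["database_id"],
            "file_id": row["file_id"],
            # Average wait per I/O; guard against files with no activity.
            "avg_read_ms": row["io_stall_read_ms"] / reads if reads else 0.0,
            "avg_write_ms": row["io_stall_write_ms"] / writes if writes else 0.0,
            "total_stall_ms": row["io_stall_read_ms"] + row["io_stall_write_ms"],
        })
    return sorted(ranked, key=lambda f: f["total_stall_ms"], reverse=True)

if __name__ == "__main__":
    for entry in rank_files_by_stall(SAMPLE_FILE_STATS):
        print(entry)
```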

Capture Queries with a Trace

Columns to Capture

What’s Going On
• Text Data
• DatabaseID and/or DatabaseName
• Login Name
• Host Name
• Application Name

What The Impact Was
• CPU
• Reads
• Writes
• Duration
• Start Time
• End Time

Profiler’s Results: A Trace Table

Order By Duration Descending

Casting and Grouping
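
The slides show this step as queries against the trace table. The same idea can be sketched in Python, assuming the trace rows have been exported as dictionaries keyed by the captured columns. Grouping on the first characters of the query text is a crude stand-in for casting the text column to a fixed-length string; the prefix length, sample rows, and function name are all assumptions.

```python
from collections import defaultdict

# Hypothetical trace rows using the column names captured above.
SAMPLE_TRACE = [
    {"TextData": "SELECT * FROM Orders WHERE CustomerID = 42",
     "Duration": 900, "CPU": 300, "Reads": 5000, "Writes": 0},
    {"TextData": "SELECT * FROM Orders WHERE CustomerID = 97",
     "Duration": 1100, "CPU": 350, "Reads": 5200, "Writes": 0},
    {"TextData": "UPDATE Stock SET Qty = 0 WHERE ItemID = 7",
     "Duration": 1500, "CPU": 100, "Reads": 40, "Writes": 2},
]

def group_by_prefix(rows, prefix_len=30):
    """Total the impact columns per query prefix, longest total duration first."""
    totals = defaultdict(lambda: {"Count": 0, "Duration": 0, "CPU": 0,
                                  "Reads": 0, "Writes": 0})
    for row in rows:
        key = (row["TextData"] or "")[:prefix_len]
        bucket = totals[key]
        bucket["Count"] += 1
        for column in ("Duration", "CPU", "Reads", "Writes"):
            bucket[column] += row[column]
    return sorted(totals.items(), key=lambda item: item[1]["Duration"],
                  reverse=True)

if __name__ == "__main__":
    for prefix, stats in group_by_prefix(SAMPLE_TRACE):
        print(prefix, stats)
```

Grouping matters because a query that runs briefly but thousands of times can outweigh the single slowest statement.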

Correlate Metrics & Trace
• Show a cause and effect relationship
• Fields to mentally “join” on:
  – Date/Time ranges
  – CPU
  – Reads/Writes
  – Duration
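
The date/time "join" can also be done mechanically. A minimal sketch, assuming the trace rows carry StartTime and EndTime as datetime values and that a spike time has been picked from the Perfmon data; the 15-second window, sample rows, and function name are illustrative assumptions.

```python
from datetime import datetime, timedelta

# Hypothetical trace rows; only the columns needed for the join are shown.
TRACE_ROWS = [
    {"TextData": "nightly report", "Duration": 35000,
     "StartTime": datetime(2015, 5, 25, 10, 0, 5),
     "EndTime": datetime(2015, 5, 25, 10, 0, 40)},
    {"TextData": "small lookup", "Duration": 1000,
     "StartTime": datetime(2015, 5, 25, 10, 5, 0),
     "EndTime": datetime(2015, 5, 25, 10, 5, 1)},
    {"TextData": "import batch", "Duration": 22000,
     "StartTime": datetime(2015, 5, 25, 9, 59, 50),
     "EndTime": datetime(2015, 5, 25, 10, 0, 12)},
]

def queries_during(trace_rows, spike_time, window=timedelta(seconds=15)):
    """Return queries whose run time overlaps the window around a metric spike."""
    window_start = spike_time - window
    window_end = spike_time + window
    hits = [row for row in trace_rows
            if row["StartTime"] <= window_end and row["EndTime"] >= window_start]
    # Longest-running queries first: they are the likeliest suspects.
    return sorted(hits, key=lambda row: row["Duration"], reverse=True)

if __name__ == "__main__":
    spike = datetime(2015, 5, 25, 10, 0, 15)
    for row in queries_during(TRACE_ROWS, spike):
        print(row["TextData"], row["Duration"])
```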

Metrics-Trace-Mitigation Process

If Our Servers Were Houses…

Before

After

Consistent and Repeatable
• 100 users accessing the web site
• Closing a typical financial period
• 10 users running a report
• Importing 100,000 records for a nightly ETL process
• Think scripts, think load generation tool
• Capture statistics during test run
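
Once the same test has been run before and after a change, the two sets of statistics can be compared counter by counter. A minimal sketch, assuming each run has already been reduced to one average per counter; the counter values and the function name are made up for illustration.

```python
# Hypothetical per-counter averages from two runs of the same repeatable test.
BEFORE = {"Processor Queue Length": 8, "Page life expectancy": 300}
AFTER = {"Processor Queue Length": 2, "Page life expectancy": 600}

def compare_runs(before, after):
    """Return {counter: (before, after, percent_change)} for shared counters."""
    report = {}
    for name in before.keys() & after.keys():
        old, new = before[name], after[name]
        # Percent change is undefined when the baseline is zero.
        change = (new - old) / old * 100 if old else None
        report[name] = (old, new, change)
    return report

if __name__ == "__main__":
    for counter, (old, new, change) in sorted(compare_runs(BEFORE, AFTER).items()):
        print(counter, old, new, change)
```

Whether a rise or a fall is good depends on the counter: a lower queue length is an improvement, while a higher page life expectancy is.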

When To Take A Picture
• Adding new hardware
• Installing a SQL Server service pack
• Changing storage configurations
• New application versions
• Every quarter

Plus Monitoring For…
• Things break
• Populations change over time
• Budgeting
• Need to enforce standards
• We’re not the only ones working on the house

Save Perfmon & Profiler Data
• Central file share
• Even better: in a database
• Name by server, by date
• Revisit every budget season
• Use for new hire training

Tool: Performance Dashboard

Tool: Data Mining

Table Analysis Tools For The Cloud

Detect Categories of Load

Works for Profiler Results Too

Tool: ClearTrace

Cleans Up Queries
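
To illustrate what "cleaning up" a query means, here is a minimal Python sketch that replaces literal values with placeholders so that otherwise identical queries group together. This is not ClearTrace's actual algorithm, only an assumed approximation of the idea, and the regular expressions cover just simple string and numeric literals.

```python
import re

def normalize_query(sql):
    """Replace literals with '?' so queries differing only by values match."""
    text = re.sub(r"'(?:[^']|'')*'", "?", sql)         # quoted string literals
    text = re.sub(r"\b\d+(?:\.\d+)?\b", "?", text)     # standalone numbers
    text = re.sub(r"\s+", " ", text).strip()           # collapse whitespace
    return text.upper()                                # ignore keyword casing

if __name__ == "__main__":
    first = "SELECT * FROM Orders WHERE CustomerID = 42 AND Status = 'Open'"
    second = "select * from Orders  where CustomerID = 97 and Status = 'Closed'"
    print(normalize_query(first))
    print(normalize_query(first) == normalize_query(second))
```

With the literals stripped, the grouped totals show the cost of a query pattern rather than of one specific call.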

Sample Problem #1
• Metrics tell us:
  – Very high disk queue lengths on data drive
• Trace tells us:
  – Report queries doing table scans w/o indexes
  – Many scheduled reports run simultaneously

Ways We Can Mitigate It
• Add covering indexes
• Modify existing indexes
• Add hard drives to the data file array
• Add memory to cache scanned tables
• Run reports serially, not all at once

Sample Problem #2
• Metrics tell us:
  – Page file drive queue lengths average >20
  – Page file use averages >1%
  – Available memory averages less than 250 MB
• Trace tells us:
  – No unusual queries

Memory Configuration

Ways We Can Mitigate It
• Add memory and enable AWE/PAE
• Add memory and upgrade to 64-bit
• Move the app to its own server
• Reduce SQL’s min/max memory sizes

Sample Problem #3
• Metrics look OK, but every 15 minutes:
  – Long drive queues on the log file drive
  – Page life expectancy drops near zero
  – Network traffic jumps
• Trace tells us:
  – Transaction log backups are running

Ways We Can Mitigate It
• Stop doing log backups
• Put the databases in simple mode
• Add drives to the transaction log array
• Throttle the transaction log backups

Sample Problem #4
• Metrics tell us:
  – CPU average is high
  – Disk, memory look OK
• Trace tells us:
  – Queries are using cursors
  – Operating on individual records, not sets

How We Can Mitigate It
• Buy really fast processors
• Spend a lot on licensing
• Change cursor to set-based query
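
The difference between row-by-row and set-based work can be shown with a small sketch. It uses Python's built-in sqlite3 module rather than SQL Server, so it only illustrates the pattern; the orders table, its columns, and the 10% discount are hypothetical. The loop issues one UPDATE per record, much as a cursor would, while the set-based version issues a single statement.

```python
import sqlite3

def make_db(row_count=1000):
    """Create an in-memory table of hypothetical orders (amounts in cents)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, "
                 "amount_cents INTEGER, discounted_cents INTEGER)")
    conn.executemany("INSERT INTO orders (id, amount_cents) VALUES (?, ?)",
                     [(i, 10000) for i in range(1, row_count + 1)])
    return conn

def discount_row_by_row(conn):
    """Cursor-style: fetch every record, then update them one at a time."""
    rows = conn.execute("SELECT id, amount_cents FROM orders").fetchall()
    for order_id, amount in rows:
        conn.execute("UPDATE orders SET discounted_cents = ? WHERE id = ?",
                     (amount * 9 // 10, order_id))
    return len(rows)  # number of UPDATE statements issued

def discount_set_based(conn):
    """Set-based: one statement describes the change for all records."""
    conn.execute("UPDATE orders SET discounted_cents = amount_cents * 9 / 10")
    return 1  # a single UPDATE statement

def total_discounted(conn):
    return conn.execute("SELECT SUM(discounted_cents) FROM orders").fetchone()[0]

if __name__ == "__main__":
    for update in (discount_row_by_row, discount_set_based):
        db = make_db()
        statements = update(db)
        print(update.__name__, statements, total_discounted(db))
```

Both versions end with the same data; the set-based one gets there with one statement instead of one per record.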

Wrapping Things Up
• Double-check the event log first
• Don’t get overwhelmed: focus with the Metric – Trace – Mitigation process
• Show a clear cause and effect
• Use pro tools to get an edge

Resources On The Web
• My blog about Perfmon: www.BrentOzar.com/perfmon
• Excel Table Analysis Tools for the Cloud: www.SQLServerDataMining.com/cloud
• ClearTrace: http://www.cleardata.biz/cleartrace/
• SQL Server community: SQLServerPedia.com