History of database monitoring

Post on 24-Jan-2017

Midnight January 28, 1986 Lives are on the line

History of Database Monitoring: My own experiences

http://kylehailey.com Kylelf@gmail.com

History …

1988: version 6
1989: Utlbstat/utlestat
1990: joined
1991
1992: version 7, waits
1993: Patrol vs M2
1994: Tcl/Tk, Europe Car
1995: Tcl/Tk waits
1996
1997: version 8
1998: Statspack 8.1.6, top waits
1999: Spotlight
2000: Statspack 9iR2, top events
2001: version 9
2002: design OEM 10
2003: version 10, OEM 10
2005: version 10.2, Top Activity
2006: Ashmon
2008: DB Optimizer
200X: YaaMs, Delphix
2016: Amazon

Computer Performance

1. Interactive
• Opaque: static, idiosyncrasies
Vs
• Fun: graphic, informative

2. Sampling
• Counters and ratios
Vs
• Waits and sampling

3. Graphics
• Spaghetti on the wall
Vs
• Intelligence in the interface

Theme:

Let’s make performance tuning a video game

The journey of simplicity: Designing an Interface

The journey of simplicity

1. Seems simple

“When you start looking at a problem and it seems really simple, you don’t really understand the complexity of the problem.” – Steve Jobs

2. Realize it’s complex
3. Create complex solution

“Then you get into the problem, and you see that it’s really complicated, and you come up with all these convoluted solutions. That’s sort of the middle, and that’s where most people stop.” – Steve Jobs

4. Complex solution is bad
5. Simple and powerful is hard

“But the really great person will keep on going and find the key, the underlying principle of the problem, and come up with an elegant, really beautiful solution that works.” – Steve Jobs

Simple can be harder than complex.

You have to work hard to get your thinking clean to make it simple.

Prototype & Iterate (Cary Millsap)

Thought: images are running the world

Computers (& DBs) can be “black boxes”

How do you get in?

Wordstar

Text Friction

You log in, then what?

1984 OK, UI can change everything

Computer performance

What’s inside?

OR

How do you make good tools?

Prototype and Iterate

Make it a video game!

Utlbstat/Utlestat …

• Intrusive
• Overwhelming
• Ratios & Averages

rem $Header: utlbstat.sql 26-feb-96.19:20:51 gpongrac Exp $ bstat.sql
Rem Copyright (c) 1988, 1996 by Oracle Corporation
Rem NAME
REM UTLBSTAT.SQL
Rem MODIFIED
Rem khailey 03/15/99 - add current user fields to stats$date, bug 594266
Rem jloaiza 10/14/95 - add tablespace size
Rem jloaiza 09/19/95 - add waitstat
Rem jloaiza 09/04/95 - add per second and background waits
Rem drady 09/09/93 - merge changes from branch 1.1.312.2
Rem drady 03/22/93 - merge changes from branch 1.1.312.1
Rem drady 08/24/93 - bug 173918
Rem drady 03/04/93 - fix bug 152986
Rem glumpkin 11/16/92 - Renamed from UTLSTATB.SQL
Rem glumpkin 10/19/92 - Renamed from BSTAT.SQL
Rem jloaiza 01/07/92 - rework for version 7
Rem mroberts 08/16/91 - fix view for v7
Rem rlim 04/29/91 - change char to varchar2
Rem Laursen 01/01/91 - V6 to V7 merge
Rem Loaiza 04/04/89 - fix run dates to minutes instead of months
Rem Martin 02/22/89 - Creation
Rem Jloaiza 02/23/89 - changed table names, added dates, added param dump
Rem

insert into stats$begin_event select * from v$system_event;
insert into stats$begin_roll select * from v$rollstat;
insert into stats$begin_file select * from stats$file_view;
insert into stats$begin_dc select * from v$rowcache;
insert into stats$begin_stats select * from v$sysstat;
insert into stats$begin_lib select * from v$librarycache;
insert into stats$begin_latch select * from v$latch;
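
The inserts above copy cumulative counters into "begin" tables; the companion utlestat script takes an "end" copy and reports the difference. A minimal Python sketch of that begin/end delta idea follows. The event names are borrowed from later slides, the numbers are made up for illustration, and `snapshot_delta` is a hypothetical helper, not part of either script.

```python
def snapshot_delta(begin, end):
    """Return the per-event change between two cumulative-counter snapshots."""
    # Events that first appear after the begin snapshot start from zero.
    return {name: end[name] - begin.get(name, 0) for name in end}

# Hypothetical cumulative time waited (centiseconds) at two points in time.
begin = {"log file sync": 1200, "db file sequential read": 5400}
end = {
    "log file sync": 1500,
    "db file sequential read": 9400,
    "free buffer waits": 250,
}

delta = snapshot_delta(begin, end)

# Report the interval's waits, largest first.
for name, waited in sorted(delta.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:30s} {waited:8d}")
```

The result is only an average over the whole interval, which is one reason the slides call this style of report "Ratios & Averages".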

1993 First Monitors - Patrol

1993 Patrol on Dec 8400

1993

Roger Saunders

M2
• Light Weight
• Direct Memory Access
• Sampling

1994 Lightweight graphics: Tcl/Tk + M2

1995 Tcl/Tk dynamic

1995 Monitor Everything Approach

Wait Events Became the Focus

Graphs created dynamically

Easy to destroy

Improvements

• Scale Graphs Equally

Eliminate:

• Background waits
• Idle waits
• Extraneous waits
• Wait counts

log_file_switch_completion : increase log file size

http://oraperf.sourceforge.net/seminar/ex3_test_2.html

Centi-seconds

free buffer waits : increase db_block_buffers

log file sync : log file -> raw device

db file sequential read : increase db_block_buffers

write complete wait, free buffer waits, db file sequential reads : increase db_block_buffers

Final

Tuning catproc.sql on version 7

Compulsive Tuning Disorder

Missing CPU usage to put into perspective

1998 8.1.6 Statspack

• Kyle wanted M2 & Graphics
• Connie wanted statspack
• Boss: neither!
• Connie did statspack
• I left

• In the meantime …

1999 Spotlight

Without Reading a Manual

• Handspring’s site crashed Nov 25, 1999
• Biggest sales day of the year
• Library cache latch contention
• No DBAs
• Downloaded Quest’s Spotlight
• Installed and identified the problem within minutes
• Solution: a code fix

Spotlight – Stacked Waits!

Ratio of waits to CPU

• How do you get CPU?
• CPU stats?

Centi-seconds

Session count
CPU stacked
No wait groups

EM v10 Proposed Perf Page (v4.1)

Avoid Scrolling and Hiding Data

OEM

Sampling vs Counters

• Given a wait bottleneck
  • Which user
  • Which SQL
  • What object / file / block

• Not feasible with counters
• Easy and cheap with sampling: multi-dimensional

Before ASH:

• Sessions: v$sesstat, v$session_event
  • # sessions x (# wait events + statistics)
  • Example: 150 x (800 + 200) = 150,000
• SQL: v$sql
  • Could be 10000s
  • Takes out latches that compete with other SQL executions
• Objects: v$segstat
  • Could be 1000s of objects
• Files: v$filestat

Expensive!
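
The session arithmetic on this slide can be checked with a few lines of Python. This is a rough sketch only: the 150 / 800 / 200 figures come from the slide, while the count of 10 active sessions and both function names are assumptions made up for the comparison.

```python
def session_counter_cells(sessions, wait_events, statistics):
    """Values to read per poll when tracking every counter for every session."""
    return sessions * (wait_events + statistics)

def sample_rows(active_sessions):
    """Rows captured per sample when only active sessions are recorded."""
    return active_sessions

# Figures from the slide: 150 sessions, 800 wait events, 200 statistics.
counter_cells = session_counter_cells(150, 800, 200)

# Assumption for illustration only: 10 of the 150 sessions are active.
rows = sample_rows(10)

print(f"counter values per poll: {counter_cells:,}")
print(f"sample rows per poll:    {rows}")
```

The gap between the two numbers is the point of the slide: the counter approach grows with every session and every statistic, while a sample grows only with the sessions that are doing something.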

Multi-Dimensional

Top Consumers
• Session
• User
• Object
• Module.Action
• Program
• Service
• Client
• Wait

X

Top Resources
• CPU
• Waits
  • Event (800*)
• I/O
  • File
  • Block
• Time

X

Top SQL
• SQL ID
• Plan
• Child#

And Aggregated over any time Period
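
A small Python sketch of what "multi-dimensional" means in practice: once each sample records who was active, on what SQL, and on which event, any combination of dimensions can be counted over any time window. The sample rows, the `sql_a`-style identifiers, and the `top` helper are all hypothetical; this is not how any particular tool stores its data.

```python
from collections import Counter

# Each row is one active session observed at one instant (hypothetical data).
SAMPLES = [
    {"t": 1, "session": 25, "user": "SCOTT", "sql_id": "sql_a", "event": "CPU"},
    {"t": 1, "session": 34, "user": "SYSTEM", "sql_id": "sql_b", "event": "db file sequential read"},
    {"t": 2, "session": 25, "user": "SCOTT", "sql_id": "sql_a", "event": "CPU"},
    {"t": 2, "session": 36, "user": "SCOTT", "sql_id": "sql_c", "event": "enq: TX - row lock contention"},
    {"t": 3, "session": 34, "user": "SYSTEM", "sql_id": "sql_b", "event": "db file sequential read"},
    {"t": 3, "session": 36, "user": "SCOTT", "sql_id": "sql_c", "event": "enq: TX - row lock contention"},
    {"t": 4, "session": 36, "user": "SCOTT", "sql_id": "sql_c", "event": "enq: TX - row lock contention"},
]

def top(samples, dimensions, start, end):
    """Count samples per combination of dimensions for start <= t <= end."""
    counts = Counter(
        tuple(s[d] for d in dimensions)
        for s in samples
        if start <= s["t"] <= end
    )
    return counts.most_common()

# Same samples, different questions: top events, then top user/SQL pairs.
print(top(SAMPLES, ["event"], 1, 4))
print(top(SAMPLES, ["user", "sql_id"], 2, 4))
```

Answering the second question with counters would require a separate counter for every user and SQL pair, maintained in advance; with samples it is a different grouping over rows already collected.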

Multi-dimensional

Sessions: 25, 34, 36, 38, 45, 63, 65, 87
SQL: F1qcyh20550cf, fj6gjgsshtxyx, 0cjsxw5ndqdbc, 8t8as9usk11qw, dr1rkrznhh95b, 10dkqv3kr8xa5, 38zhkf4jdyff4, 298wmz1kxjs1m
Waits: CPU; Enq: TX – row lock contention; SQL*Net break/reset to client; db file scattered read; db file sequential read
Groups: IO, Application
Service: GL, OE
User: Scott, System, Sys
Program: Sqlplus, Toad
SQL detail: Package, Procedure, Plan, Child #

Sampling vs Waits

Statistic Lag Time

Copyright 2006 Kyle Hailey

Counters

Samples

Slight Lags

If you are not tuning for time, you are wasting time

Max CPU (yard stick)

Top Activity

SQL

Sessions

LOAD

Cambrian Explosion: YaaMs

Confio Ignite

Dell Foglight

Lab128

D.side

W-ASH

Ashviewer

emlite

ASHmon

MyOra

Mumbai

Precise

Lighty

Example Problem

How do you communicate quantitative data?

Midnight January 28, 1986 Lives are on the line

Thanks to Edward Tufte

Night before the Flight

Jan 27, 1986

Estimated launch temperature 29°F

13 Pages Faxed

3 different types of names

Damage (in overwhelming detail) but no temperatures

Missing data for 5 erosion damage flights

Blow-by damage

Test engines fired horizontally

Shows “blow-by”, not the more important “erosion”

Damage at hottest and coldest launches* (of the flights shown)

Next day’s flight

Predict Temperature

Recommendation

Original Engineering data (chart: temperature axis 55 to 80, vertical axis 1 to 3)

“damages at the hottest and coldest temperature”

Would you launch?

Congressional Hearings Evidence

No damage legend
Damage hard to read
Temperature correlation difficult

Original Data (chart: temperature axis 55 to 80, vertical axis 1 to 3)

Clearer

1. Y-axis: amount of damage (not number of damage)
2. Include successes *
3. Mark differences
4. Normalize same temp
5. Scale known vs unknown

* Only external temperatures were known, not the temperature of the solid rocket boosters

Be accurate enough

Damage on every flight below 65

No damage on every flight above 75

Known World

(chart: damage axis 4, 8, 12; temperature axis 55 to 80 for the known flights, extended down to 30 to 50 for the unknown range)

Difficult

NASA Engineers Fail
Congressional Investigators Fail
Data Visualization is Difficult

But …

Lack of Clarity can be devastating

Visualization can be powerful

“If I can't picture it, I can't understand it” - Albert Einstein

Anscombe's Quartet

                         I              II             III             IV
                      x      y       x      y       x      y       x      y
                     10   8.04      10   9.14      10   7.46       8   6.58
                      8   6.95       8   8.14       8   6.77       8   5.76
                     13   7.58      13   8.74      13  12.74       8   7.71
                      9   8.81       9   8.77       9   7.11       8   8.84
                     11   8.33      11   9.26      11   7.81       8   8.47
                     14   9.96      14   8.1       14   8.84       8   7.04
                      6   7.24       6   6.13       6   6.08       8   5.25
                      4   4.26       4   3.1        4   5.39      19  12.5
                     12  10.84      12   9.13      12   8.15       8   5.56
                      7   4.82       7   7.26       7   6.42       8   7.91
                      5   5.68       5   4.74       5   5.73       8   6.89

Average               9   7.5        9   7.5        9   7.5        9   7.5
Standard Deviation 3.31   2.03    3.31   2.03    3.31   2.03    3.31   2.03
Linear Regression      1.33           1.33           1.33           1.33

Graphics for Anscombe’s Quartet
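
The near-identical summary statistics of the quartet can be reproduced with the Python standard library. The sketch below uses the eleven data rows from the table; `summarize` is a hypothetical helper, the standard deviation is the sample (n-1) form, and the fit is an ordinary least-squares line, which may not be the same quantity as the table's "Linear Regression" row.

```python
from statistics import mean, stdev

X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

def summarize(xs, ys):
    """Mean, sample standard deviation, and least-squares line for one data set."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return {
        "mean_x": mx, "mean_y": my,
        "sd_x": stdev(xs), "sd_y": stdev(ys),
        "slope": slope, "intercept": my - slope * mx,
    }

for name, (xs, ys) in QUARTET.items():
    s = summarize(xs, ys)
    print(f"{name:>3}: mean=({s['mean_x']:.2f}, {s['mean_y']:.2f}) "
          f"sd=({s['sd_x']:.2f}, {s['sd_y']:.2f}) "
          f"fit: y = {s['intercept']:.2f} + {s['slope']:.2f}x")
```

All four sets print almost the same line of numbers, which is why the text alone cannot distinguish them and the graphs can.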

Counties in US

> 3000 Counties > 50 pages

“The humans … are exceptionally good at parsing visual information.” Knowledge representation in cognitive science. Westbury, C. & Wilensky, U. (1998)

Visualizations can also obfuscate

Pretty Picture

Spaghetti at the wall

Spaghetti at the wall II

Amazon Cloudwatch

Imagine Trying to Drive your Car

And is updated once an hour

Or would you like it to look …

Would you want your dashboard to look like:

If you are not tuning for time, you are wasting time

When Developers say

The Database is slow

AAS ~= 0
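
A rough Python sketch of the AAS check, assuming the usual definition of average active sessions as database time divided by elapsed time, with each active-session sample standing in for one sampling interval. The function names, the sample counts, and the 4-core figure are made up for illustration; the comparison against core count follows the "Max CPU (yard stick)" idea from the earlier slide.

```python
def average_active_sessions(active_samples, elapsed_seconds, sample_interval=1.0):
    """Estimate AAS: each sample represents sample_interval seconds of activity."""
    return active_samples * sample_interval / elapsed_seconds

def exceeds_cpu(aas, cpu_cores):
    """True when average load is above what the CPUs alone could serve."""
    return aas > cpu_cores

# Hypothetical 10-minute window on a 4-core host, sampled once per second.
quiet = average_active_sessions(active_samples=12, elapsed_seconds=600)
busy = average_active_sessions(active_samples=4800, elapsed_seconds=600)

print(f"quiet window: AAS = {quiet:.2f}, above max CPU: {exceeds_cpu(quiet, 4)}")
print(f"busy window:  AAS = {busy:.2f}, above max CPU: {exceeds_cpu(busy, 4)}")
```

With AAS close to zero the database is nearly idle during the window, so the slowness the developers report is likely to come from somewhere else.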

Do You Want?

Engineering Data?

Pretty Pictures

Clean and Clear

Summary
• Textual statistics are difficult to parse
• Pretty pictures can be misleading
• Goal: clear graphics are powerful

Kylelf@gmail.com http://kylehailey.com

Simple can be harder than complex.

You have to work hard to get your thinking clean to make it simple.

Prototype & Iterate

•END