SQL Performance Tuning for Developers

Post on 07-Nov-2014


SQL SERVER 2005/2008

Performance tuning for the developer

Michelle Gutzait

gutzait@pythian.com

michelle.gutzait@gmail.com

Blog: http://michelle-gutzait.spaces.live.com/default.aspx


Whoami?

SQL Server Team Lead @ www.pythian.com

24/7 Remote DBA services

I live in Montreal


Agenda – Part I

General concepts of performance and tuning
• Performance bottlenecks
• Optimization tools
• Table and index
• The data page
• The optimizer
• Execution plans

Agenda – Part II

Development performance Tips
• T-SQL commands

• Views

• Cursors

• User-defined functions

• Working with temporary tables and table variables

• Stored Procedures and functions

• Data Manipulation

• Transactions

• Dynamic SQL

• Triggers

• Locks

• Table and database design issues

“The fact that I can does not mean that I should!”

Kimberly Tripp (?)

Always treat your code as if it's running:
• Frequently
• On a large amount of data
• In a very busy environment

The goal

Minimum response time and maximum throughput

Reduce network traffic, disk I/O and CPU time

Start optimizing as early as possible, as it will be harder later.


Design and Tuning Tradeoffs

Client/Server Tuning Levels

[Figure: tuning levels connected by network communication. Client side: database applications (presentation layer, application logic), client OS, network. Server side: network, OS/IO subsystem, SQL Server, operating system and hardware.]

The Typical Performance Pyramid

Application / Query / Database Design

Operating Environment

Hardware

Beware: in certain environments this pyramid may be upside down!

Application & performance

The result

“Ugly” code may perform much better


Performance bottlenecks - tools

Windows Performance Monitor

SQL Server Profiler

SQL Server Management Studio

Performance bottlenecks - tools

3rd-party tools

Let’s remember a few basic concepts…

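Rows On A Page

[Figure: layout of a data page. A 96-byte page header is followed by 8,096 bytes that hold the data rows (Row A, Row C, Row B) and the row offset table (entries A, B, C, 2 bytes each).]

The Data Row

[Figure: layout of a data row. A 4-byte header, fixed data, the null block (NB), the variable block (VB) and variable data.]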


Data access methods


Index

• Helps locate data more rapidly


Index Structure: Nonclustered Index


Structure of Clustered Index


Covering Index


Index


Heap table

• A table with no clustered index

RID is built from file:page:row

Table Scan

Will usually be faster using a clustered index

Execution Plan – cost-based optimization

[Figure: optimizer flowchart from T-SQL to execution, with the stages Parsing, Sequence Tree, Normalization, "Is SQL?", Trivial Plan Optimization, Syntactic Transformation, SQL Optimization (SARG Selection, Index Selection, JOIN Selection), "Is Cheap Enough?" (Yes/No), Execution Plan, Caching, Memory Allocation, Execution.]

Optimizer hints

View optimizer info

Few concepts in the Execution Plan algorithm…

Search ARGuments (SARG)

Always isolate columns

SARG                                     | NOT SARG
where MonthlySalary > 600000/12          | where MonthlySalary * 12 > 600000
where ID in (select ID from vw_Person)   | where dbo.fu_IsPerson(ID) = 1234
where firstname like 'm%'                | where SUBSTRING(firstname,1,1) = 'm'

SARG:
=, BETWEEN, >, <, LIKE 'x%', EXISTS

Not SARGable:
LIKE '%x', NOT LIKE, NOT EXISTS, FUNCTION(column)

AND creates a single SARG

OR creates multiple SARGs
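
The "isolate the column" rule can be sketched as follows. This is a minimal illustration, assuming a hypothetical table dbo.Employees with a hypothetical index on MonthlySalary; neither the table nor the index name comes from the slides.

```sql
-- Hypothetical table and index, for illustration only
CREATE TABLE dbo.Employees (
    ID            int          NOT NULL PRIMARY KEY,
    firstname     varchar(100) NOT NULL,
    MonthlySalary money        NOT NULL
);
CREATE INDEX IX_Employees_MonthlySalary ON dbo.Employees (MonthlySalary);
GO

-- Not a SARG: the column is buried inside an expression, so the optimizer
-- generally has to evaluate it row by row (scan instead of seek)
SELECT ID FROM dbo.Employees WHERE MonthlySalary * 12 > 600000;

-- SARG: the column stands alone on one side of the comparison, so a seek
-- on IX_Employees_MonthlySalary becomes possible
SELECT ID FROM dbo.Employees WHERE MonthlySalary > 600000 / 12;
```

Comparing the two execution plans in SQL Server Management Studio is a simple way to check whether a predicate is being treated as a SARG.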

Table, column and index statistics

[Figure: the sorted values of the state column of a Sales table (AL, AK, CA, CA, CA, CT, IL, … WI, WY) are sampled into steps 0 to 7; the step values (AL, CA, IL, IL, OR, TX, WA, WY) are stored in the statblob column of sys.sysobjvalues (internal).]

Update statistics - “Rules of thumb”

Use auto create and auto update statistics

5% of the table changes

Still bad query:

Create statistics

Update statistics with FULLSCAN

Use multi-column statistics when queries have multi-column conditions

Use AUTO_UPDATE_STATISTICS_ASYNC database option

No stats for temporary objects and functions
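
The rules of thumb above map onto a handful of statements. The sketch below assumes a hypothetical database MyDatabase and a hypothetical table dbo.Sales with state and city columns; only the option and command names come from the slide.

```sql
-- Hypothetical database name: keep automatic statistics on
ALTER DATABASE MyDatabase SET AUTO_CREATE_STATISTICS ON;
ALTER DATABASE MyDatabase SET AUTO_UPDATE_STATISTICS ON;
-- Update statistics in the background instead of making the query wait
ALTER DATABASE MyDatabase SET AUTO_UPDATE_STATISTICS_ASYNC ON;
GO

-- Multi-column statistics for queries that filter on both columns
-- (dbo.Sales, state and city are hypothetical names)
CREATE STATISTICS ST_Sales_State_City
ON dbo.Sales (state, city)
WITH FULLSCAN;

-- If a query is still badly estimated, refresh all statistics on the table
-- by reading every row instead of a sample
UPDATE STATISTICS dbo.Sales WITH FULLSCAN;

-- Inspect the histogram steps that the optimizer will use
DBCC SHOW_STATISTICS ('dbo.Sales', ST_Sales_State_City);
```

FULLSCAN gives the most accurate histogram but reads the whole table, so on large tables it is usually scheduled outside peak hours.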


Join selection

JOIN Types

NESTED LOOP

MERGE

HASH

Factors:

• JOIN strategies

• JOIN order

• SARG

• Indexes

Joins - Optimization tip

HASH joins are used when no useful index exists on one or both of the JOIN inputs. These can be converted to MERGE or LOOP joins through effective indexing.

Index intersection

SELECT *
FROM authors
WHERE au_fname = 'Fox' AND au_lname = 'Mulder'


Tuning with indexes…

Index

Index tips

MORE indexes – for queries, LESS indexes – for updates

More indexes – more possibilities for optimizer

Having a CLUSTERED INDEX is almost always a good idea…

Sort operations: TOP, DISTINCT, GROUP BY, ORDER BY and JOIN; WHERE

As narrow as possible to avoid excessive I/O

Use integer values rather than character values

Values with low selectivity

Covering index - faster than a clustered index

Index tips 2

CLUSTERED index key in all non-clustered indexes (otherwise RID is used)

Frequently updated column and clustered index

Drop costly UNUSED indexes

High volume inserts – incremental Clustered index

Surrogate integer primary key (identity ?)

Clustered index for random modifications and index bottleneck

CLUSTERED index on non-unique columns – a 4-byte uniquifier is added to duplicate key values

Index tips 3

Creating index before rare heavy operations

When changing/dropping a CLUSTERED index, drop all NON-CLUSTERED indexes first. Don't forget to recreate them later

Indexes are almost always in cache, therefore are faster

Column referenced by OR and no index on the column → table scan

PRIMARY KEY and UNIQUE CONSTRAINTS create indexes

Foreign Keys do NOT create indexes

Index tips 4

Wide and fewer indexes are sometimes better than many and narrower indexes

INCLUDE columns for covering index

Indexes are used to reduce the number of rows fetched, otherwise they are not necessary

If TEMPDB resides on a different physical disk, you may use SORT_IN_TEMPDB
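
Two of the tips above (INCLUDE columns and SORT_IN_TEMPDB) can be combined in one statement. This is a minimal sketch that assumes a hypothetical dbo.Orders table with CustomerID, OrderDate and TotalDue columns.

```sql
-- Hypothetical table and columns, for illustration only.
-- The key stays narrow (CustomerID); the INCLUDE columns are stored only
-- at the leaf level so that the query below is covered by the index.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
ON dbo.Orders (CustomerID)
INCLUDE (OrderDate, TotalDue)
WITH (SORT_IN_TEMPDB = ON);   -- build the sort in tempdb, useful when tempdb sits on separate disks

-- Covered query: every referenced column is in the index,
-- so the base table does not have to be touched
SELECT OrderDate, TotalDue
FROM dbo.Orders
WHERE CustomerID = 42;
```

The trade-off is the usual one from the earlier tips: the wider the index, the more it costs on every insert and update.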


Data modifications…

Data modifications

[Figure: the data page from “Rows On A Page” (96-byte page header, 8,096 bytes of data rows, row offset table with 2 bytes per row) after modifications, now showing Row A, Row C, Row B, Row E and Row D.]

Deferred update – forwarded

In a heap – rows are forwarded leaving old address in place

Index fragmentation

INDEXES - fragmentation

DBCC SHOWCONTIG ('Orders')

DBCC SHOWCONTIG scanning 'Orders' table...
Table: 'Orders' (21575115); index ID: 1, database ID: 6
TABLE level scan performed.
- Pages Scanned................................: 20
- Extents Scanned..............................: 5
- Extent Switches..............................: 4
- Avg. Pages per Extent........................: 4.0
- Scan Density [Best Count:Actual Count].......: 60.00% [3:5]
- Logical Scan Fragmentation ..................: 0.00%
- Extent Scan Fragmentation ...................: 40.00%
- Avg. Bytes Free per Page.....................: 146.5
- Avg. Page Density (full).....................: 98.19

SELECT *
FROM sys.dm_db_index_physical_stats
(DatabaseID, TableId, IndexId, NULL, Mode)
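
A concrete call to sys.dm_db_index_physical_stats might look like the sketch below. It assumes the current database contains a table dbo.Orders (the schema name is an assumption) and joins to sys.indexes only to show index names.

```sql
-- Fragmentation per index of dbo.Orders in the current database.
-- Arguments: database id, object id, index id (NULL = all),
-- partition number (NULL = all), scan mode.
SELECT OBJECT_NAME(ips.object_id)        AS TableName,
       i.name                            AS IndexName,
       ips.index_type_desc,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.Orders'), NULL, NULL, 'LIMITED') AS ips
INNER JOIN sys.indexes AS i
        ON i.object_id = ips.object_id
       AND i.index_id  = ips.index_id
ORDER BY ips.avg_fragmentation_in_percent DESC;
```

The 'LIMITED' mode is the cheapest scan; 'SAMPLED' and 'DETAILED' read more pages and return more columns with values.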

Indexed Views

SELECT t1.Col2, t2.Col3, count(*) as Cnt
FROM Table_1 t1
INNER JOIN Table_2 t2 ON t1.Col1 = t2.Col1
GROUP BY t1.Col2, t2.Col3

Possible bottleneck
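
As a sketch of how the query above could be materialized as an indexed view: the view and index names are hypothetical, and Table_1 and Table_2 are assumed to live in the dbo schema with non-nullable grouping columns.

```sql
-- An indexed view must be schema-bound, must use two-part object names,
-- and must use COUNT_BIG(*) instead of COUNT(*) when it has a GROUP BY.
CREATE VIEW dbo.vw_Table1_Table2_Counts
WITH SCHEMABINDING
AS
SELECT t1.Col2, t2.Col3, COUNT_BIG(*) AS Cnt
FROM dbo.Table_1 AS t1
INNER JOIN dbo.Table_2 AS t2 ON t1.Col1 = t2.Col1
GROUP BY t1.Col2, t2.Col3;
GO

-- The unique clustered index is what actually stores the view's result set
CREATE UNIQUE CLUSTERED INDEX IX_vw_Table1_Table2_Counts
ON dbo.vw_Table1_Table2_Counts (Col2, Col3);
```

The "possible bottleneck" is the maintenance cost: every insert, update or delete on Table_1 or Table_2 also has to update the stored aggregate. Creating the index also requires specific session SET options (for example ANSI_NULLS and QUOTED_IDENTIFIER ON).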


Questions

End of Part I…

Agenda – Part II

Development performance Tips

• T-SQL commands

• Views

• Cursors

• User-defined functions

• Working with temporary tables and table variables

• Stored Procedures and functions

• Data Manipulation

• Transactions

• Dynamic SQL

• Triggers

• Locks

• Table and database design issues

Returning/processing too much data…

[Figure: the client/server tuning levels diagram again (database applications with presentation layer and application logic, client OS, network, OS/IO subsystem, SQL Server), now with the disk added.]

What could possibly be “wrong” with this query?

SELECT * FROM MyTable WHERE Col1 = 'x'

SELECT Col1 FROM MyTable1, MyTable2

SELECT TOP 2000000 Col1 FROM MyTable1

Looping on the client side:
WHILE @i < 10000
    Update tb1 WHERE Col = @i
    @i = @i + 1
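
The client-side loop above can usually be replaced by a single set-based statement. The sketch below assumes the loop starts at 0 and that the update sets a hypothetical column Processed; the slide does not show either detail.

```sql
-- Row by row: up to 10,000 separate statements and network round trips.
-- Set-based: one statement, one round trip, one execution plan.
UPDATE tb1
SET Processed = 1                 -- hypothetical column; the slide elides the SET list
WHERE Col >= 0 AND Col < 10000;   -- same range the loop would have walked through
```

The range predicate is also a SARG, so an index on Col can be used to locate the affected rows.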

What could possibly be wrong with this query (cont)?

SELECT *
FROM MyTable t1
INNER JOIN MyTable_2 t2 on t1.Col1 = t2.Col1
INNER JOIN MyTable_3 t3 on t1.Col1 = t3.Col1
LEFT JOIN MyTable_4 t4 on t1.Col1 = t4.Col1
LEFT JOIN MyTable_5 t5 on t1.Col1 = t5.Col1
LEFT JOIN MyTable_6 t6 on t1.Col1 = t6.Col1
LEFT JOIN MyTable_7 t7 on t1.Col1 = t7.Col1
LEFT JOIN MyTable_8 t8 on t1.Col1 = t8.Col1
LEFT JOIN MyTable_9 t9 on t1.Col1 = t8.Col1
LEFT JOIN MyTable_10 t10 on t1.Col1 = t8.Col1
……

What is the difference? Short or long(er)?

Short:
IF EXISTS
(SELECT 1 FROM MyTable)

Long(er) ?
SELECT @rc=COUNT(*)
FROM MyTable
IF @rc > 0

Short:
IF EXISTS
(SELECT 1 FROM MyTable)

Long(er) ?
IF EXISTS
(SELECT * FROM MyTable)

Short:
IF EXISTS
(SELECT 1 FROM MyTable)

Long(er) ?
IF NOT EXISTS
(SELECT 1 FROM MyTable)

Short:
SELECT MyTable1.Col1,
MyTable1.Col2
FROM MyTable1
INNER JOIN MyTable2
ON MyTable1.Col1 = MyTable2.Col1

Long(er) ?
SELECT MyTable1.Col1,
MyTable1.Col2
FROM MyTable1
WHERE MyTable1.Col1 IN
(SELECT MyTable2.Col1
FROM MyTable2)
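
The first pair above can be run side by side as a small batch. This is only a sketch against the slide's MyTable; the PRINT messages are added for illustration.

```sql
DECLARE @rc int;

-- Long(er): COUNT(*) has to count every row before the IF can be evaluated
SELECT @rc = COUNT(*) FROM MyTable;
IF @rc > 0
    PRINT 'MyTable has rows (checked with COUNT)';

-- Short: EXISTS can stop as soon as the first row is found
IF EXISTS (SELECT 1 FROM MyTable)
    PRINT 'MyTable has rows (checked with EXISTS)';
```

On a small table the difference is invisible; on a table with millions of rows the COUNT version reads a whole index or the whole table just to answer a yes/no question.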

What is the difference? Short or long(er)?

Short:
SELECT MyTable1.Col1,
MyTable1.Col2
FROM MyTable1
WHERE MyTable1.Col1 IN
(SELECT MyTable2.Col1
FROM MyTable2)

Long(er) ?
SELECT MyTable1.Col1,
MyTable1.Col2
FROM MyTable1
WHERE EXISTS
(SELECT 1
FROM MyTable2
WHERE MyTable2.Col1 =
MyTable1.Col1)


Sorting the data…

What is the difference? Sort or no sort

Sort:
SELECT Col1
FROM Table1
UNION
SELECT Col2
FROM Table2

No sort:
SELECT Col1
FROM Table1
UNION ALL
SELECT Col2
FROM Table2

Sort:
SELECT DISTINCT Col1
FROM Table1

No sort:
SELECT Col1
FROM Table1

Sort:
SELECT Col1
FROM Table1
WHERE col2 IN (SELECT DISTINCT Col3
FROM Table2)

No sort:
SELECT Col1
FROM Table1
WHERE col2 IN (SELECT Col3
FROM Table2)

Sort:
CREATE VIEW VW1 AS
SELECT * FROM DB2..Table1
ORDER BY Col1

No sort:
CREATE VIEW VW1 AS
SELECT * FROM
DB2..Table1

Which one is BETTER? Sort or no sort

Sort:
SELECT Col1
FROM Table1
WHERE ModifiedDate
IN (SELECT TOP 1 ModifiedDate
FROM Table1
ORDER BY ModifiedDate
DESC)

No sort:
SELECT Col1
FROM Table1
WHERE ModifiedDate =
(SELECT MAX(ModifiedDate)
FROM Table1)


The OR operator

What is the difference? OR or no OR

OR:
SELECT Col1
FROM Table1
WHERE Col1 = 'x'
OR Col2 = 'y'

No OR:
SELECT Col1
FROM Table1
WHERE Col1 = 'x'
UNION
SELECT Col1
FROM Table1
WHERE Col2 = 'y'

OR:
SELECT Col1
FROM Table1
WHERE Col1 IN
(SELECT C1 FROM Table2)
OR Col1 IN
(SELECT C2 FROM Table2)

No OR:
SELECT Col1
FROM Table1
WHERE EXISTS (SELECT 1 FROM Table2
WHERE Col1 = C1
UNION ALL
SELECT 1 FROM Table2
WHERE Col1 = C2)

OR:
SELECT *
FROM Table1
WHERE Col1 IN
(SELECT C1 FROM Table2)
OR Col2 IN
(SELECT C2 FROM Table2)

No OR:
SELECT *
FROM Table1
????


Locks

Lock granularity

• Row Locks
• Page Locks
• Table Locks

Lock granularity

[Figure: row locks and page locks escalate to table locks at > 5000 locks.]

Principal lock types: S, U, X

Dirty Read

•WITH (NOLOCK)

• SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED


Nonrepeatable Read

• Default


Phantom Read

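ANSI Isolation Level

Level   | Isolation level          | Dirty reads | Nonrepeatable reads | Phantom reads
Level 0 | Read uncommitted         | Yes         | Yes                 | Yes
Level 1 | Read committed (DEFAULT) | No          | Yes                 | Yes
Level 2 | Repeatable reads         | No          | No                  | Yes
Level 3 | Serializable             | No          | No                  | No
        | SNAPSHOT                 | No          | No                  | No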

Programming with isolation level locks

Database

Transaction

Statement/table
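
The three scopes listed above can be sketched as follows. MyDatabase is a hypothetical database name; the authors table is the one from the pubs sample database used elsewhere in these slides.

```sql
-- Database scope (hypothetical database name): switch the default
-- read committed behaviour to row versioning and allow SNAPSHOT transactions.
-- READ_COMMITTED_SNAPSHOT can only be changed when no other connections
-- are active in the database.
ALTER DATABASE MyDatabase SET READ_COMMITTED_SNAPSHOT ON;
ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;
GO

-- Transaction scope: the setting stays in effect for the session
-- until it is changed again
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;

    -- Runs under REPEATABLE READ
    SELECT au_lname FROM authors;

    -- Statement/table scope: the table hint overrides the session
    -- setting for this one table reference only
    SELECT au_lname FROM authors WITH (READCOMMITTED);

COMMIT TRANSACTION;
```

The narrower the scope, the easier it is to reason about: a table hint affects one reference in one statement, while a database option changes the behaviour of every session.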


Isolation levels - example

USE pubs

GO

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

GO

BEGIN TRANSACTION

SELECT au_lname FROM authors WITH (NOLOCK)

GO

The locks generated are:

EXEC sp_lock

GO

EXEC sp_lock

spid  dbid  ObjId     IndId  Type  Resource        Mode      Status
51    5     0                DB                    S         GRANT
51    10    85575343  2      KEY   (a802b526c101)  RangeS-S  GRANT
51    10    85575343  2      KEY   (54013f7c6be5)  RangeS-S  GRANT
51    10    85575343  2      KEY   (b200dbb63a8d)  RangeS-S  GRANT
51    10    85575343  2      KEY   (49014dc93755)  RangeS-S  GRANT
51    10    85575343  2      KEY   (170130366f3d)  RangeS-S  GRANT
51    10    85575343  2      PAG   1:1482          IS        GRANT
51    10    85575343  2      KEY   (c300d27116cf)  RangeS-S  GRANT
51    10    85575343  0      TAB                   IS        GRANT
51    10    85575343  2      KEY   (1101ed75c8f8)  RangeS-S  GRANT
51    10    85575343  2      KEY   (2802f6d3696b)  RangeS-S  GRANT
51    10    85575343  2      KEY   (0701fdd03550)  RangeS-S  GRANT
51    10    85575343  2      KEY   (7f00d0d5506b)  RangeS-S  GRANT

SELECT object_name(85575343)
GO
-----------------------------
authors

Temporary Objects

Temporary objects

#tmp

##GlobalTmp

Tempdb..StaticTmp

@TableVariable

Table-valued functions

Common Table Expression (CTE)

View ?

FROM (SELECT …)
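
Four of the options above can be seen side by side in one short batch. The column names and sample row are hypothetical and only serve the illustration.

```sql
-- Local temporary table: stored in tempdb, can have indexes and statistics
CREATE TABLE #tmp (ID int NOT NULL PRIMARY KEY, Name varchar(50) NOT NULL);
INSERT INTO #tmp (ID, Name) VALUES (1, 'a');

-- Table variable: scoped to the batch, no column statistics
DECLARE @TableVariable TABLE (ID int NOT NULL PRIMARY KEY, Name varchar(50) NOT NULL);
INSERT INTO @TableVariable (ID, Name)
SELECT ID, Name FROM #tmp;

-- Common Table Expression: a named query that exists only for the
-- statement that follows it (the previous statement must end with ;)
WITH cte AS (
    SELECT ID, Name FROM #tmp WHERE ID > 0
)
SELECT c.ID, c.Name
FROM cte AS c
INNER JOIN @TableVariable AS tv ON tv.ID = c.ID;

-- Derived table: the FROM (SELECT …) form, also statement-scoped
SELECT d.ID
FROM (SELECT ID FROM #tmp) AS d;

DROP TABLE #tmp;
```

CTEs and derived tables do not store anything; they are expanded into the outer query, so they cost nothing by themselves but also cache nothing.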


Stored Procedures…

What are the benefits of Stored Procedures?

Reduce network traffic

Reusable execution plans

Efficient client execution requests

Code reuse

Encapsulation of logic

Client independence

Security implementation

As a general rule of thumb, all Transact-SQL code should be called from stored procedures.


Stored Procedures tips

SET NOCOUNT ON

No sp_

Owned by DBO

Exec databaseowner.objectname

Select from databaseowner.objectname

Break down large SPs
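
A procedure that follows these tips might look like the sketch below. The procedure name is hypothetical; the authors table and its columns are from the pubs sample database used earlier in the slides.

```sql
-- No sp_ prefix: names starting with sp_ are looked up in master first.
-- Owned by dbo and created with an owner-qualified name.
CREATE PROCEDURE dbo.usp_GetAuthorsByLastName
    @LastName varchar(40)
AS
BEGIN
    -- Suppress the "n rows affected" messages sent to the client
    SET NOCOUNT ON;

    -- Owner-qualified table reference
    SELECT au_id, au_fname, au_lname
    FROM dbo.authors
    WHERE au_lname = @LastName;
END
GO

-- Owner-qualified call
EXEC dbo.usp_GetAuthorsByLastName @LastName = 'Ringer';
```

Qualifying object names with the owner lets SQL Server resolve them without an extra lookup and helps cached plans to be reused across users.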


SP Recompilations

#temp instead of @Temp table variables

DDL statements

Some set commands

Use SQL Server Profiler to check recompilations


Which one is better and why?

IF @P = 0

SQL Statement Block1

ELSE

SQL Statement Block2

IF @P = 0

Exec sp_Block1

ELSE

Exec sp_Block2

What could be problematic here?

CREATE PROC MySP
@p_FROM INT, @p_TO INT
AS
SELECT count(*) FROM MyTable
WHERE PK between @p_FROM and @p_TO

[Figure: MyTable, 7 million rows, with PK values 0, 5, 10, 34, 87, 198,739, …, 3,898,787]

CREATE … WITH RECOMPILE

EXECUTE … WITH RECOMPILE

sp_recompile objname
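
The three recompile options named above can be written out as follows. The procedure name MySP_Recompile is hypothetical, used only so that it does not clash with MySP from the slide.

```sql
-- Option 1: never cache a plan; compile on every execution
CREATE PROC MySP_Recompile
    @p_FROM INT, @p_TO INT
WITH RECOMPILE
AS
SELECT count(*) FROM MyTable
WHERE PK BETWEEN @p_FROM AND @p_TO;
GO

-- Option 2: recompile only for this call, for example when the range
-- is known to be unusually wide
EXECUTE MySP @p_FROM = 0, @p_TO = 3898787 WITH RECOMPILE;
GO

-- Option 3: mark the procedure so that its plan is rebuilt
-- the next time it runs
EXEC sp_recompile 'MySP';
```

The underlying issue is that the cached plan is built for the parameter values of the first call; a plan that suits a range of five rows can be a poor choice for a range of millions, and the reverse.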

Dynamic SQL… sp_executesql vs. EXECUTE

Which one is better and why?

EXEC ('SELECT Col1 FROM Table1 ' +
'WHERE ' + @WhereClause)

Exec sp_executesql @SQLString

Exec sp_executesql @SQLString,
@ParmDefinition, @PK = @IntVariable

Reusable Execution (Query) Plan - generated by sp_executesql
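
A parameterized sp_executesql call, using the variable names from the previous slide, could be sketched like this. The column PK on Table1 and the values 35 and 201 are assumptions for the illustration.

```sql
DECLARE @SQLString      nvarchar(500);
DECLARE @ParmDefinition nvarchar(500);
DECLARE @IntVariable    int;

-- The statement text stays constant; only the parameter value changes
SET @SQLString      = N'SELECT Col1 FROM Table1 WHERE PK = @PK';
SET @ParmDefinition = N'@PK int';

SET @IntVariable = 35;
EXEC sp_executesql @SQLString, @ParmDefinition, @PK = @IntVariable;

-- Same text, different value: the cached plan can be reused
SET @IntVariable = 201;
EXEC sp_executesql @SQLString, @ParmDefinition, @PK = @IntVariable;
```

Because the value is passed as a typed parameter rather than concatenated into the string, this form also avoids the SQL injection risk of EXEC with a user-supplied WHERE clause.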


Cursors…

Cursors - implications

Resources Required at Each Stage

What could possibly replace cursors?

Loops ?

Temp tables

Local variables (!)

CTEs

CASE statements

Multiple queries

AND…

Replacing cursor

Tip #1

Select Seq=identity(int,1,1),
Fld1,
Fld2,
……
Into #TmpTable
From Table1
Order by …

Seq   | Fld1    | Fld2  | …..
1     | Aaa     | 45.7  |
2     | Absb    | 555.0 |
3     | Adasd   | 12.8  |
4     | oioiooi | 0.0   |
…..   | …..     | …..   | …..
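
Once the rows carry a sequence number, a plain WHILE loop can walk through them without a cursor. The sketch below builds a small stand-in for #TmpTable with hypothetical data types and the sample rows from the table above; the PRINT is a placeholder for the real per-row work.

```sql
-- Stand-in for the #TmpTable produced by SELECT ... INTO (types are assumptions)
CREATE TABLE #TmpTable (
    Seq  int IDENTITY(1,1) PRIMARY KEY,
    Fld1 varchar(50)   NOT NULL,
    Fld2 decimal(10,1) NOT NULL
);
INSERT INTO #TmpTable (Fld1, Fld2) VALUES ('Aaa', 45.7);
INSERT INTO #TmpTable (Fld1, Fld2) VALUES ('Absb', 555.0);
INSERT INTO #TmpTable (Fld1, Fld2) VALUES ('Adasd', 12.8);

DECLARE @Seq int, @MaxSeq int, @Fld1 varchar(50), @Fld2 decimal(10,1);
SET @Seq = 1;
SELECT @MaxSeq = MAX(Seq) FROM #TmpTable;

WHILE @Seq <= @MaxSeq
BEGIN
    -- Fetch the current row by its sequence number (a primary key seek)
    SELECT @Fld1 = Fld1, @Fld2 = Fld2
    FROM #TmpTable
    WHERE Seq = @Seq;

    -- Placeholder for the per-row processing
    PRINT @Fld1 + ': ' + CAST(@Fld2 AS varchar(20));

    SET @Seq = @Seq + 1;
END

DROP TABLE #TmpTable;
```

This is still row-by-row processing, so it only pays off when the work per row genuinely cannot be expressed as one set-based statement.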

Replacing cursor

Tip #2

declare @var int
set @var = 0

Update Table1
set @Var = Fld2 = Fld2 + @Var
From Table1 with (index=pk_MyExampleTable)
option (maxdop 1)
go


Cursor Example…


TRY ME….


Optimizer Hints…

Optimizer Hints

Most common:

WITH (ROWLOCK)

WITH (NOLOCK)

WITH (INDEX = IX_INDEX_NAME)

WITH (HOLDLOCK)

SET FORCEPLAN ON

OPTION (MAXDOP 1)

Join hints (MERGE/HASH/LOOP)

Isolation levels WITH (SERIALIZABLE, READCOMMITTED)

Granularity level (UPDLOCK, TABLOCK, TABLOCKX)

What is possibly wrong here?

BEGIN TRAN
UPDATE MyTable SET Col1 = 'x'
WHERE Col1 IN
(SELECT Col1 from MyTable_2)
COMMIT TRAN

[Figure: MyTable with a Col1 column holding the values x, x, y, y, y, m, …, z]

BEGIN TRAN
UPDATE MyTable SET Col1 = 'x'
WHERE Col1 IN
(SELECT Col1 from MyTable_2 WITH (NOLOCK) )
COMMIT TRAN


The Transaction Log…

T-LOG

What is wrong here?

BEGIN TRAN
UPDATE MyTable SET Col1 = 'x'
WHERE Col1 = 'y'
IF @@ROWCOUNT <> 10
ROLLBACK TRAN
COMMIT TRAN

[Figure: MyTable with a Col1 column holding the values x, x, y, y, y, m, …, z; 1000 rows with Col1 = 'y']
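
One possible repair of the batch above, offered as a sketch rather than as the presenter's answer: after ROLLBACK TRAN the original falls through to COMMIT TRAN with no open transaction, which raises an error, and it rolls back 1000 updated rows that did not need to be touched in the first place.

```sql
-- Variant 1: make COMMIT and ROLLBACK mutually exclusive
BEGIN TRAN;
UPDATE MyTable SET Col1 = 'x'
WHERE Col1 = 'y';
IF @@ROWCOUNT <> 10
    ROLLBACK TRAN;
ELSE
    COMMIT TRAN;
GO

-- Variant 2: check the business rule first, so that nothing has to be
-- written to the log and then undone when the rule fails
DECLARE @cnt int;
SELECT @cnt = COUNT(*) FROM MyTable WHERE Col1 = 'y';
IF @cnt = 10
BEGIN
    BEGIN TRAN;
    UPDATE MyTable SET Col1 = 'x'
    WHERE Col1 = 'y';
    COMMIT TRAN;
END
```

Variant 2 leaves a window between the count and the update in which other sessions can change the data; whether that matters depends on the application and the isolation level in use.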

What could be possibly wrong here?

BEGIN TRAN
DELETE MyTable
COMMIT TRAN

[Figure: MyTable, 7 million rows, with a Col1 column holding the values x, x, y, y, y, m, …, z]

T-Log size

Concurrency

How do we “solve” this ?

What if we have a WHERE clause in the DELETE ?
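
One common way to approach this, sketched under the assumption that the rows may be removed in several smaller transactions, is to delete in batches. The batch size of 10,000 is an arbitrary illustrative value.

```sql
DECLARE @BatchSize int;
DECLARE @Rows int;
SET @BatchSize = 10000;   -- arbitrary value; tune for the workload
SET @Rows = 1;

WHILE @Rows > 0
BEGIN
    BEGIN TRAN;

    -- Each batch is its own short transaction, so locks are held briefly
    -- and log space can be reused between batches
    DELETE TOP (@BatchSize) FROM MyTable;
    -- With a WHERE clause the same pattern applies:
    -- DELETE TOP (@BatchSize) FROM MyTable WHERE Col1 = 'y';

    SET @Rows = @@ROWCOUNT;   -- 0 when nothing is left to delete

    COMMIT TRAN;
END
```

When every row has to go and no foreign key references the table, TRUNCATE TABLE MyTable is another option, since it deallocates pages instead of logging each deleted row.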

Transaction Habits

As short as possible

Long transactions:
• Reduce concurrency
• Blocking and deadlocks more likely
• Transaction log space cannot be freed while the transaction is open
• T-log IO

No “logical” ROLLBACKS!


Triggers…

What is wrong here?

CREATE TRIGGER TRG_MyTable_UP
ON MyTable
AFTER INSERT
AS
UPDATE MyTable
SET InsertDate = getdate()
FROM MyTable
INNER JOIN inserted ON MyTable.PK = inserted.PK

[Figure: MyTable with the columns PK and InsertDate; PK values 1, 5, 13, 67, 89, 1234, …, 345667]
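
A possible alternative to the trigger above is a DEFAULT constraint, which fills the column as part of the INSERT itself instead of updating every inserted row a second time. The constraint name is hypothetical.

```sql
-- Fill InsertDate at insert time, with no second write to the row
ALTER TABLE MyTable
ADD CONSTRAINT DF_MyTable_InsertDate DEFAULT (GETDATE()) FOR InsertDate;
GO

-- The trigger is then no longer needed
DROP TRIGGER TRG_MyTable_UP;
```

The two are not strictly equivalent: a default only applies when the INSERT does not supply a value for InsertDate, whereas the trigger overwrites whatever was supplied.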


Typical Trigger Applications

• Cascading modifications through related tables

• Rolling back changes that violate data integrity

• Enforcing restrictions that are too complex for rules or constraints

• Maintaining duplicate data

• Maintaining columns with derived data

• Performing custom recording

• Try to use constraints instead of triggers, whenever possible.


Tables Design Issues…

Employees table

Column name     | Type          | Property                               | Key/index
Employee ID     | Int           | NOT NULL, Identity (values are unique) | Clustered
First Name      | Char(100)     | NOT NULL                               |
Last Name       | Char(100)     | NOT NULL                               |
Hire Date       | Datetime      | NULL                                   |
Description     | Varchar(8000) | NULL                                   |
ContractEndDate | Char(8)       | NOT NULL                               | Index
SelfDescription | Varchar(8000) | NOT NULL default ''                    |
Picture         | Image         | NULL                                   |
Comments        | Text          | NULL                                   |

Application rules:

All queries fetch EmployeeID, FirstName, LastName and HireDate WHERE EmployeeID equals or is BETWEEN two values, where ContractEndDate >= getdate()

All other columns are fetched only when the user drills down from the application

FirstName, LastName, HireDate and ContractEndDate rarely change

Comments, Description and SelfDescription are rarely filled in and they never appear in the WHERE clause

Picture column is ALWAYS updated after the row already exists.

Once the contract ends, the data should be saved but will not be queried by the application

First…

Column name     | Type                    | Property                               | Key/index
Employee ID     | Int                     | NOT NULL, Identity (values are unique) | Clustered UNIQUE
First Name      | Char(100) → Varchar(100) | NOT NULL                              |
Last Name       | Char(100) → Varchar(100) | NOT NULL                              |
Hire Date       | Datetime                | NULL                                   |
Description     | Varchar(8000)           | NULL                                   |
ContractEndDate | Char(8) → Datetime      | NOT NULL                               | Index
SelfDescription | Varchar(8000)           | NOT NULL default '' → NULL             |
Picture         | Image → Varbinary(MAX)  | NULL                                   |
Comments        | Text → Varchar(MAX)     | NULL                                   |

4 different tables? This is vertical partitioning…

Employees (active employees)

Column name     | Key/index
Employee ID     | Clustered PK
First Name      |
Last Name       |
Hire Date       |
ContractEndDate | Index

Employees details (1:1 with Employees)

Column name     | Key/index
Employee ID     | Clustered PK
Description     |
SelfDescription |
Picture         |
Comments        |

OldEmployees (inactive employees)

Column name     | Key/index
Employee ID     | Clustered PK
First Name      |
Last Name       |
Hire Date       |
Description     |
ContractEndDate |
SelfDescription |
Picture         |
Comments        |

Horizontal partitioning

Three partitions with the same structure:

Column name     | Type
Employee ID     | INT
First Name      | Varchar(100)
Last Name       | Varchar(100)
Hire Date       | Datetime
ContractEndDate | Datetime

Partition 1: Contract Date < 2008-01-01
Partition 2: Contract Date >= 2008-01-01 and < 2009-01-01
Partition 3: Contract Date >= 2009-01-01
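
If native table partitioning is available (in SQL Server 2005/2008 it is an Enterprise Edition feature), the three date ranges above could be expressed as in the sketch below. All object names are hypothetical, and the slide may equally have meant three separate tables.

```sql
-- RANGE RIGHT: each boundary value belongs to the partition on its right,
-- giving  < 2008-01-01 | >= 2008-01-01 and < 2009-01-01 | >= 2009-01-01
CREATE PARTITION FUNCTION pf_ContractEndDate (datetime)
AS RANGE RIGHT FOR VALUES ('20080101', '20090101');

-- Map all three partitions to one filegroup to keep the example simple
CREATE PARTITION SCHEME ps_ContractEndDate
AS PARTITION pf_ContractEndDate ALL TO ([PRIMARY]);

CREATE TABLE dbo.EmployeesPartitioned (
    EmployeeID      int          NOT NULL,
    FirstName       varchar(100) NOT NULL,
    LastName        varchar(100) NOT NULL,
    HireDate        datetime     NULL,
    ContractEndDate datetime     NOT NULL,
    -- A unique index on a partitioned table has to include the partitioning column
    CONSTRAINT PK_EmployeesPartitioned
        PRIMARY KEY CLUSTERED (EmployeeID, ContractEndDate)
) ON ps_ContractEndDate (ContractEndDate);
```

Queries that filter on ContractEndDate can then be limited to the relevant partition, which matches the application rule that ended contracts are kept but not queried.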


Tips for the application side…

Beware of…

Server-side cursors prior to .NET 2.0

Sorts and grouping on the client

End-user reporting

Default Transaction isolation levels

Intensive communication with database

Connection pooling

Long transactions

Ad-hoc T-SQL

SQL injection…

Performance Audit Checklist

Does the Transact-SQL code return more data than needed?

Does the application interact with the database server too often?

Are cursors being used when they don't need to be? Does the application use server-side cursors?

Are UNION and UNION ALL properly used?

Is SELECT DISTINCT being used properly?

Is the WHERE clause SARGable?

Are temp tables being used when they don't need to be?

Are hints being properly used in queries?

Are views unnecessarily being used?

Are stored procedures and sp_executesql being used whenever possible?

Inside stored procedures, is SET NOCOUNT ON being used?

Do any of your stored procedures start with sp_?

Are all stored procedures owned by DBO, and referred to in the form of databaseowner.objectname?

Are you using constraints or triggers for referential integrity?

Are transactions being kept as short as possible? Does the application keep transactions open when the user is modifying data?

Is the application properly opening, reusing, and closing connections?

Questions/Autographs

End of Part II…