Date post: | 04-Jan-2016 |
Upload: | amber-brook-osborne |
Large Data Operations Overview
Updates & Deletes
Modifying large row counts can be very slow?
Dropping indexes improves performance?
Inserts – See SQLDev.Net
Covered in various presentations by Gert Drapers
Execution Plan with Indexes
1. Insert multiple rows into table with clustered index
2. Rows are spooled
3. Nonclustered indexes are modified from the spooled data
Operations with indexes in place should be faster
Exception - large inserts where bulk log requirements are met
Execution Plan Cost Formula Review
Table Scan or Index Scan
I/O: 0.0375785 + 0.000740741 per page
CPU: 0.0000785 + 0.0000011 per row
Index Seek – Plan Formula
I/O Cost = 0.006328500 + 0.000740741 per additional page(≤1GB)
= 0.003203425 + 0.000740741 per additional page(>1GB)
CPU Cost = 0.000079600 + 0.000001100 per additional row
Bookmark Lookup
I/O Cost = multiple of 0.006250000 (≤1GB)
= multiple of 0.003124925 (>1GB)
CPU Cost = 0.0000011 per row
Insert, Update & Delete
IUD I/O Cost ~ 0.01002 – 0.01010 (>100 rows)
IUD CPU Cost = 0.000001 per row
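As a rough illustration, the index seek and bookmark lookup formulas above can be written as small functions. This is only a sketch: the function and parameter names are made up for this example, the source does not say what the 1GB threshold measures (so `over_1gb` simply selects the second constant), and the bookmark lookup "multiple" is passed in directly as `lookups`.

```python
# Constants transcribed from the plan cost formulas listed above.
# Function and parameter names are illustrative, not SQL Server names.

def index_seek_cost(additional_pages, additional_rows, over_1gb=False):
    """Return (io_cost, cpu_cost) for an index seek."""
    io_base = 0.003203425 if over_1gb else 0.006328500
    io_cost = io_base + 0.000740741 * additional_pages
    cpu_cost = 0.000079600 + 0.000001100 * additional_rows
    return io_cost, cpu_cost


def bookmark_lookup_cost(lookups, rows, over_1gb=False):
    """Return (io_cost, cpu_cost) for a bookmark lookup.

    `lookups` is the multiplier applied to the per-lookup I/O constant;
    the slides only say the I/O cost is a "multiple" of that constant.
    """
    io_unit = 0.003124925 if over_1gb else 0.006250000
    return io_unit * lookups, 0.0000011 * rows


if __name__ == "__main__":
    print(index_seek_cost(10, 100, over_1gb=True))
    print(bookmark_lookup_cost(1, 1))
```

A seek that touches no additional pages or rows costs just the two base constants, which is why a single-row seek is so much cheaper than a scan's per-table base cost plus per-page charges.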
Plan Cost – Unit of Measure
Time in seconds? CPU time?
0.0062500 sec -> 160/sec
0.000740741 -> 1350/sec (8KB) -> 169/sec (64K) -> 10.8MB/sec
S2K BOL: Administering SQL Server, Managing Servers, Setting Configuration Options: cost threshold for parallelism Option
Query cost refers to the estimated elapsed time, in seconds, required to execute a query on a specific hardware configuration.
Too fast for 7200RPM disk random I/Os.
About right for 1997 sequential disk transfer rate?
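The rates quoted on this slide follow from treating each cost constant as a time in seconds and taking its reciprocal. A minimal sketch of that arithmetic, assuming 8KB pages, eight pages per 64K read, and decimal megabytes (variable names are illustrative):

```python
# Interpret the plan cost constants as seconds per operation.
RANDOM_IO_COST = 0.0062500    # cost of one random I/O (bookmark lookup unit)
PAGE_COST = 0.000740741       # cost of one additional page in a scan

random_ios_per_sec = 1 / RANDOM_IO_COST        # about 160
pages_per_sec = 1 / PAGE_COST                  # about 1350 8KB pages
reads_64k_per_sec = pages_per_sec / 8          # about 169 64K reads
mb_per_sec = pages_per_sec * 8 / 1000          # about 10.8 MB/sec

print(round(random_ios_per_sec), round(pages_per_sec),
      round(reads_64k_per_sec), round(mb_per_sec, 1))
```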
Test Table
CREATE TABLE M3C_00 (
  ID int NOT NULL, ID2 int NOT NULL,
  ID3 int NOT NULL, ID4 int NOT NULL,
  ID5 int NOT NULL, ID6 int NOT NULL,
  SeqID int NOT NULL,
  DistID int NOT NULL,
  Value char(10) NOT NULL,
  rDecimal decimal (9,4) NOT NULL,
  rMoney money NOT NULL,
  rDate datetime NOT NULL,
  sDate datetime NOT NULL )
CREATE CLUSTERED INDEX IX_M3C_00 ON M3C_00 (ID) WITH SORT_IN_TEMPDB
10M rows in table, 99 rows per page, 101,012 pages, 808MB
100K rows for each distinct value of SeqID and DistID
Common SeqID values are in adjacent rows
Common DistID values are in separate 8KB pages (100 rows apart)
Data Population Script
DECLARE @BatchStart int, @BatchEnd int, @BatchTotal int, @BatchSize int,
  @BatchRow int, @RowTotal int, @I int, @p int, @sc1 int, @dv1 int
SELECT @BatchStart = 1, @BatchEnd = 1000, @BatchTotal = 1000, @BatchSize = 10000
SELECT @RowTotal = @BatchTotal*@BatchSize, @p = 100, @sc1 = 100000
SELECT @I = (@BatchStart-1)*@BatchSize+1, @dv1 = @RowTotal/@sc1
WHILE @BatchStart <= @BatchEnd
BEGIN
  -- One explicit transaction per batch of @BatchSize rows
  BEGIN TRANSACTION
  SELECT @BatchRow = @BatchStart*@BatchSize
  WHILE @I <= @BatchRow
  BEGIN
    INSERT M3C_00 (ID,ID2,ID3,ID4,ID5,ID6,SeqID,DistID,Value,rDecimal,rMoney,rDate,sDate)
    VALUES ( @I, @I,
      1+(@I-1)*@p/@RowTotal+((@I-1)*@p)%@RowTotal,
      (@I-1)%(@sc1)+1,
      (@I-1)/2+1,
      (@I-1)%320+1,
      (@I-1)/@sc1+1,      -- SeqID: same value for @sc1 adjacent rows
      (@I-1)%(@dv1)+1,    -- DistID: same value repeats every @dv1 rows
      CHAR(65+26*rand())+CHAR(65+26*rand())+CHAR(65+26*rand())
        +CONVERT(char(6),CONVERT(int,100000*(9.0*rand()+1.0)))+CHAR(65+26*rand()),
      10000*rand(),
      10000*rand(),
      DATEADD(hour,100000*rand(),'1990-01-01'),
      DATEADD(hour,@I/5,'1990-01-01') )
    SET @I = @I+1
  END
  COMMIT TRANSACTION
  CHECKPOINT
  PRINT CONVERT(char,GETDATE(),121)+' row ' + CONVERT(char,@BatchRow)+' Complete'
  SET @BatchStart = @BatchStart+1
END
Data Population Script Notes
Double While Loop
Each Insert/Update/Delete statement is an implicit transaction
Gets separate transaction log entry
Explicit transaction – generates a single transaction log write (max 64KB per IO)
Single TRAN for entire loop requires excessively large log file
Inserts are grouped into intermediate size batches
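The batching described in these notes can be sketched outside SQL as a generator of row ranges, one range per explicit transaction. This is only an illustration of the loop bounds used by the population script (1,000 batches of 10,000 rows); `batch_ranges` is a hypothetical helper name.

```python
def batch_ranges(batch_total, batch_size):
    """Yield (first_row, last_row) for each batch, mirroring the outer WHILE loop.

    Each yielded range corresponds to one BEGIN TRANSACTION ... COMMIT pair.
    """
    for batch in range(1, batch_total + 1):
        yield (batch - 1) * batch_size + 1, batch * batch_size


ranges = list(batch_ranges(batch_total=1000, batch_size=10000))
print(len(ranges), ranges[0], ranges[-1])
```

With these settings the script commits 1,000 times instead of 10 million times (one per row) or once (a single transaction needing a very large log).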
Indexes
CREATE INDEX IX_M3C_01_Seq ON M3C_01 (SeqID) WITH SORT_IN_TEMPDB
CHECKPOINT
CREATE INDEX IX_M3C_01_Dist ON M3C_01 (DistID) WITH SORT_IN_TEMPDB
CHECKPOINT
UPDATE STATISTICS M3C_01 (IX_M3C_01_Seq) WITH FULLSCAN
UPDATE STATISTICS M3C_01 (IX_M3C_01_Dist) WITH FULLSCAN
Common SeqID values are in adjacent rows
Common DistID values are in separate 8KB pages (100 rows apart)
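To see why the two columns behave so differently, the SeqID and DistID expressions from the population script can be replayed in a few lines. This sketch assumes rows are stored in ID order at exactly 99 rows per page, which is a simplification of the figures given for the test table; the helper names are illustrative.

```python
ROW_TOTAL = 10_000_000
SC1 = 100_000                 # @sc1 in the population script
DV1 = ROW_TOTAL // SC1        # @dv1 = 100
ROWS_PER_PAGE = 99            # assumed uniform fill, rows in ID order


def seq_id(i):
    return (i - 1) // SC1 + 1     # (@I-1)/@sc1+1


def dist_id(i):
    return (i - 1) % DV1 + 1      # (@I-1)%(@dv1)+1


# Row IDs carrying the value 91 in each column.
seq_rows = range((91 - 1) * SC1 + 1, 91 * SC1 + 1)   # one contiguous block
dist_rows = range(91, ROW_TOTAL + 1, DV1)            # every 100th row

# Distinct data pages touched when fetching those rows.
seq_pages = {(i - 1) // ROWS_PER_PAGE for i in seq_rows}
dist_pages = {(i - 1) // ROWS_PER_PAGE for i in dist_rows}

print(len(seq_rows), len(seq_pages))    # 100,000 rows on about a thousand pages
print(len(dist_rows), len(dist_pages))  # 100,000 rows, each on its own page
```

Under these assumptions both predicates return 100,000 rows, but the SeqID rows sit on roughly 1% of the table's pages while every DistID row lands on a different page.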
Test Queries
-- Sequential rows, table scan
SELECT AVG(rMoney) FROM M3C_01 WHERE SeqID = 91
-- Sequential rows, index seek and bookmark lookup
SELECT AVG(rMoney) FROM M3C_01 WITH(INDEX(IX_M3C_01_Seq)) WHERE SeqID = 91
-- Distributed rows, table scan
SELECT AVG(rMoney) FROM M3C_01 WHERE DistID = 91
-- Distributed rows, index seek and bookmark lookup
SELECT AVG(rMoney) FROM M3C_01 WITH(INDEX(IX_M3C_01_Dist)) WHERE DistID = 91
Execution Plans - Select
Table scan involves 101,012 pages
Bookmark Lookup involves 100,000 rows
1 BL ~3.6X more expensive than 1 page in Table Scan
Table Scan Cost Detail
Table Scan Formula
I/O: 0.0375785 + 0.000740741 x 101,012 = 74.8
CPU: 0.0000785 + 0.0000011 x 10M = 11.0
I/O and CPU cost occasionally show ½ the expected value, but combined cost shows the expected value
Index and Bookmark Details
Bookmark Lookup
I/O: 0.003124925 x 100K x 0.998 = 311.87
CPU: 0.0000011 x 100K = 0.11
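The table scan and bookmark lookup figures can be recomputed directly from the formulas. The sketch below is a plain arithmetic check; the 0.998 factor is taken as given from the slide, and the cost of the index seek that feeds the bookmark lookup is left out because the slides do not list it. The resulting ratio of about 3.6 appears consistent with the "~3.6X" comparison quoted earlier, though the slides do not spell out how that figure was derived.

```python
PAGES = 101_012        # pages in the test table
TABLE_ROWS = 10_000_000
LOOKUP_ROWS = 100_000  # rows fetched per test query

# Table scan: base cost plus per-page I/O and per-row CPU.
scan_io = 0.0375785 + 0.000740741 * PAGES
scan_cpu = 0.0000785 + 0.0000011 * TABLE_ROWS
scan_total = scan_io + scan_cpu

# Bookmark lookup: per-lookup I/O (>1GB constant) and per-row CPU.
bl_io = 0.003124925 * LOOKUP_ROWS * 0.998
bl_cpu = 0.0000011 * LOOKUP_ROWS
bl_total = bl_io + bl_cpu

ratio = bl_total / scan_total

print(round(scan_io, 2), round(scan_cpu, 2))
print(round(bl_io, 2), round(bl_cpu, 2))
print(round(ratio, 1))
```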
Measured Query Times
SELECT query, 100K rows
                      Sequential rows           Distributed rows
256MB server mem      Index + BL   Table Scan   Index + BL   Table Scan
Query time (sec)      0.3          10.5         167          10.5
Rows or Pages/sec     333,333(R)   9,620(P)     599(R)       9,620(P)
Disk IO/sec           Low          ~1,200       ~600         ~1,200
Avg. Byte/Read        N/A          64K          8K           64K
1154MB server mem
Query time (sec)      0.266        1.076        0.373        1.090
Rows or Pages/sec     376,000      93,877       268,000      92,672
Test System: 2x2.4GHz Xeon, data on 2 15K disk drives
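The "Rows or Pages/sec" entries are the work done divided by the query time: 100,000 rows for the index plans and 101,012 pages for the table scans. A small sketch reproducing a few of the cells (names are illustrative):

```python
ROWS = 100_000     # rows returned by each test query
PAGES = 101_012    # pages read by a full table scan


def per_second(units, seconds):
    """Throughput in rows or pages per second."""
    return units / seconds


index_seq_256 = per_second(ROWS, 0.3)      # sequential rows, index + BL
scan_256 = per_second(PAGES, 10.5)         # table scan, 256MB
index_dist_256 = per_second(ROWS, 167)     # distributed rows, index + BL
scan_seq_1154 = per_second(PAGES, 1.076)   # table scan, 1154MB

print(round(index_seq_256), round(scan_256),
      round(index_dist_256), round(scan_seq_1154))
```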
Disk Bound Select Query Cost
Performance limited by disk capability
Random 300/disk (small portion of 18GB drive & high queue depth)
Sequential 38MB/sec (Seagate ST318451, first generation 15K drive)
Disk drive random I/O: ~2X gain since the mid-1990s
Sequential I/O: ~5X gain
Cost formulas underestimate current generation disk drive sequential performance relative to random
However, SQL Server cost formulas do not reflect in-memory costs
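As a back-of-the-envelope check, the disk figures above roughly predict the 256MB query times measured earlier. The sketch assumes both drives are used evenly, one random I/O per row for the distributed-rows lookup, and the 808MB table size quoted for the test table; these are simplifying assumptions for illustration, not statements from the slides.

```python
DISKS = 2
RANDOM_IOPS_PER_DISK = 300        # random I/O rate quoted per disk
SEQ_MB_PER_SEC_PER_DISK = 38      # sequential transfer rate quoted per disk
LOOKUP_ROWS = 100_000             # distributed rows, assumed one I/O each
TABLE_MB = 808                    # table size from the test table slide

# Distributed rows via index + bookmark lookup: bound by random I/O.
lookup_seconds = LOOKUP_ROWS / (DISKS * RANDOM_IOPS_PER_DISK)

# Table scan: bound by sequential transfer rate.
scan_seconds = TABLE_MB / (DISKS * SEQ_MB_PER_SEC_PER_DISK)

print(round(lookup_seconds), round(scan_seconds, 1))
```

The estimates come out near 167 seconds and a little over 10 seconds, close to the measured 167 and 10.5 seconds.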
Update Operation
Update Details
Actual Cost - Update
UPDATE query - 100K rows
                        Sequential rows         Distributed rows
256MB server mem        Index     Table Scan    Index     Table Scan
Query time (sec)        1.3       12.6          476.6     28
Checkpoint time (sec)   0.4       0.6           14.5      8
Rows/sec                57,471    7,576         203       2,778
1154MB server mem
Query time (sec)        0.8       1.3           0.9       1.5
Checkpoint time (sec)   0.2       0.1           23        23
Rows/sec                100,000   71,429        4,184     4,082
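The Rows/sec figures in this table appear to be 100,000 rows divided by the sum of query time and checkpoint time, so the cost of flushing dirty pages is counted against the update. That reading is an inference from the numbers, not something the slides state; the sketch below checks it against several cells (a few cells in the source match only approximately, presumably because the displayed times are rounded).

```python
ROWS = 100_000


def rows_per_sec(query_sec, checkpoint_sec, rows=ROWS):
    """Rows per second with checkpoint time included in the elapsed time."""
    return rows / (query_sec + checkpoint_sec)


seq_scan_256 = rows_per_sec(12.6, 0.6)     # sequential rows, table scan, 256MB
seq_index_1154 = rows_per_sec(0.8, 0.2)    # sequential rows, index, 1154MB
seq_scan_1154 = rows_per_sec(1.3, 0.1)     # sequential rows, table scan, 1154MB
dist_index_1154 = rows_per_sec(0.9, 23)    # distributed rows, index, 1154MB
dist_scan_1154 = rows_per_sec(1.5, 23)     # distributed rows, table scan, 1154MB

print(round(seq_scan_256), round(seq_index_1154), round(seq_scan_1154),
      round(dist_index_1154), round(dist_scan_1154))
```

Seen this way, the distributed-row updates with 1154MB of memory finish the query in about a second, but the 23-second checkpoint dominates the effective rate.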
Update Variation
Default plan is now a table scan
Column value is not in the index, so a bookmark lookup is required
However, the data page must be loaded into buffer cache before it can be modified regardless!
Delete Operation
Delete Details
Delete Details (2)
Delete - Actual Costs
DELETE query - 100K rows
                        Sequential rows         Distributed rows
256MB server mem        Index     Table Scan    Index     Table Scan
Query time (sec)        4.8       88.52         282       41
Checkpoint time (sec)   8.4       4.52          8.4       14
Rows/sec                7,576     1,075         340       1,800
1154MB server mem
Query time (sec)        4.1       6.4           4.2       5.3
Checkpoint time (sec)   3.7       3.9           28.6      28.6
Rows/sec                12,821    9,708         3,048     2,949
Delete – no indexes
DELETE query, no index - 100K rows
                        Sequential rows   Distributed rows
256MB server mem        Table Scan        Table Scan
Query time (sec)        11.5              26
Checkpoint time (sec)   0.1               4
Rows/sec                8,621             3,300
1154MB server mem
Query time (sec)        1.9               1.5
Checkpoint time (sec)   0.2               22
Rows/sec                47,619            4,255
Delete with Foreign Keys
Summary
When large updates and deletes are slow
Examine the execution plan
Look for nonclustered index seeks on modified tables with high row count
Use index hint to force table scan
Additional Information
www.sql-server-performance.com/joe_chang.asp
SQL Server Quantitative Performance Analysis
Server System Architecture
Processor Performance
Direct Connect Gigabit Networking
Parallel Execution Plans
Large Data Operations
Transferring Statistics
SQL Server Backup Performance with Imceda LiteSpeed