Date post: | 29-Nov-2014 |
Upload: | mark-ginnebaugh |
Microsoft SQL Server
Filtered Indexes and Sparse Columns: Together, Separately
Speaker: Don Vilen, Chief Scientist, BuySight
February 2011
Mark Ginnebaugh, User Group Leader, www.bayareasql.org
15 Feb 2011

Filtered Indexes and Sparse Columns: Together, Separately
Don Vilen, Chief Scientist, Buysight, [email protected]

Agenda
◦ Filtered Indexes
◦ Filtered Statistics
◦ Wide Tables
◦ Sparse Columns
◦ Together …
◦ … and Separately
◦ Everything is SQL Server 2008 (and later), in all editions

The Scenario
◦ 100,000 rows in the table
  • 99,500 rows are historical; the remaining 500 rows are current
  • Indicated by a NULL EndDate column, an IsActive bit, etc.
◦ All queries on current data use an index
◦ But why index the historical 99.5% of the table?
◦ 1,000 columns in a table
◦ BikeColor column is relevant only if ItemType is 'Bicycle'
  • For 0.5% of the rows; the remainder are NULL
◦ But why index all the rows regardless of ItemType value?

Filtered Indexes
◦ Indexes only the rows whose values match the WHERE clause
  • CREATE INDEX xyz ON table (columns, …) followed by a filter such as:
    WHERE EndDate IS NULL
    WHERE IsActive = 1
    WHERE ItemType = 'Bicycle'
◦ Uses:
  • Ranges of values for a smaller portion of a large table
    Avoid the common 80-90% of data where the index wouldn't be helpful
  • For categories of row data
    Index on Column120 and Column121 only useful when C1 = 37
  • Table partitions, where the index is needed only on the 'current' partition(s)
    Each partition will have the index structure, but only 'current' partitions will have any rows in the index
◦ Benefits
  • Better query performance
  • Reduction in storage costs
  • Reduction in maintenance cost/time
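
The scenario and syntax above can be sketched in T-SQL as follows. This is a minimal illustration, not the speaker's demo: the table dbo.Items, its columns, and the index names are hypothetical and chosen only to mirror the EndDate and BikeColor examples.

```sql
-- Hypothetical table: mostly historical rows, only a few current ones.
CREATE TABLE dbo.Items
(
    ItemId    int         NOT NULL PRIMARY KEY,
    ItemType  varchar(20) NOT NULL,
    BikeColor varchar(20) NULL,   -- relevant only when ItemType = 'Bicycle'
    EndDate   datetime    NULL    -- NULL marks a current row
);

-- Index only the current rows instead of the whole table.
CREATE NONCLUSTERED INDEX IX_Items_Current
    ON dbo.Items (ItemType)
    WHERE EndDate IS NULL;

-- Index BikeColor only for the category of rows where it can have a value.
CREATE NONCLUSTERED INDEX IX_Items_BikeColor
    ON dbo.Items (BikeColor)
    WHERE ItemType = 'Bicycle';
```

Each index holds entries only for the rows that satisfy its filter, which is where the storage and maintenance savings listed above would come from.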

Filtered Index – Allowed Syntax
◦ WHERE <filter_predicate> [from BOL: CREATE INDEX]
    <filter_predicate> ::= <conjunct> [ AND <conjunct> ]
    <conjunct> ::= <disjunct> | <comparison>
    <disjunct> ::= column_name IN (constant ,…)
    <comparison> ::= column_name <comparison_op> constant
    <comparison_op> ::= { IS | IS NOT | = | <> | != | > | >= | !> | < | <= | !< }
◦ No BETWEEN, no LIKE, no subquery, no variables
◦ So must be simple and deterministic
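
As a hedged illustration of working within that grammar: a range that would naturally be written with BETWEEN can be expressed as two comparisons joined by AND, and a small set of values can use IN. The table dbo.Orders and its columns are hypothetical.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE dbo.Orders
(
    OrderId   int      NOT NULL PRIMARY KEY,
    Status    tinyint  NOT NULL,
    OrderDate datetime NOT NULL
);

-- A date range written as two comparisons against constants (no BETWEEN).
CREATE NONCLUSTERED INDEX IX_Orders_2011
    ON dbo.Orders (OrderDate)
    WHERE OrderDate >= '20110101' AND OrderDate < '20120101';

-- IN with a list of constants is the <disjunct> form of the grammar.
CREATE NONCLUSTERED INDEX IX_Orders_OpenStatus
    ON dbo.Orders (Status)
    WHERE Status IN (1, 2, 3);
```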

Filtered Indexes – Requirements
◦ Always some comparison involved, so every session must agree on how operations work, which requires standard SET options
  • ON for ANSI_NULLS, ANSI_PADDING, ANSI_WARNINGS, ARITHABORT, CONCAT_NULL_YIELDS_NULL, QUOTED_IDENTIFIER
  • OFF for NUMERIC_ROUNDABORT
◦ Else:
  • If not set when the index is created, the index is not created
  • If not set when an INSERT, UPDATE, DELETE, or MERGE affects the data, the statement gives an error and rolls back
  • If not set when the index might be used to optimize a query, the index will not be considered
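
A short sketch of putting a session into the required state and checking it. The SET statements and SESSIONPROPERTY are standard T-SQL; running them explicitly at the top of a script is just one possible convention, since many client libraries already default most of these to the required values.

```sql
-- Options that must be ON for filtered indexes.
SET ANSI_NULLS ON;
SET ANSI_PADDING ON;
SET ANSI_WARNINGS ON;
SET ARITHABORT ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET QUOTED_IDENTIFIER ON;

-- Option that must be OFF.
SET NUMERIC_ROUNDABORT OFF;

-- Check the current session: 1 means ON, 0 means OFF.
SELECT SESSIONPROPERTY('ANSI_NULLS')              AS ansi_nulls,
       SESSIONPROPERTY('ANSI_PADDING')            AS ansi_padding,
       SESSIONPROPERTY('ANSI_WARNINGS')           AS ansi_warnings,
       SESSIONPROPERTY('ARITHABORT')              AS arithabort,
       SESSIONPROPERTY('CONCAT_NULL_YIELDS_NULL') AS concat_null_yields_null,
       SESSIONPROPERTY('QUOTED_IDENTIFIER')       AS quoted_identifier,
       SESSIONPROPERTY('NUMERIC_ROUNDABORT')      AS numeric_roundabort;
```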

Filtered Indexes – Applicability
◦ Non-clustered indexes only (rather obviously)
◦ For UNIQUE indexes, only the indexed rows must have unique index values
  • Duplicates in the non-indexed rows are not checked, but be careful that an update to a qualifying column doesn't cause a duplicate to occur
    CREATE UNIQUE INDEX ix1 ON xyz (c3) WHERE c2 = 10
  • So now there is a way to create a unique index on a column with multiple NULL values: create the index WHERE ColY IS NOT NULL
◦ Filtered indexes do not apply to:
  • XML indexes
  • Full-text indexes
  • Spatial indexes
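
The "unique but allow many NULLs" pattern mentioned above might look like this. The table dbo.Members and the index name are hypothetical; ColY is the column name used on the slide.

```sql
-- Hypothetical table with an optional value that must be unique when present.
CREATE TABLE dbo.Members
(
    MemberId int         NOT NULL PRIMARY KEY,
    ColY     varchar(50) NULL
);

-- Uniqueness is enforced only for rows that are in the index (ColY not NULL).
CREATE UNIQUE NONCLUSTERED INDEX UX_Members_ColY
    ON dbo.Members (ColY)
    WHERE ColY IS NOT NULL;

INSERT INTO dbo.Members (MemberId, ColY) VALUES (1, NULL);
INSERT INTO dbo.Members (MemberId, ColY) VALUES (2, NULL);  -- allowed: NULL rows are not indexed
INSERT INTO dbo.Members (MemberId, ColY) VALUES (3, 'A');
-- Inserting another row with ColY = 'A' would fail with a duplicate key error.
```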

Filtered Indexes – Getting Them Used 1
◦ QO can only use the index when it knows the index will match the conditions in the query's WHERE clause
◦ Assume Column120 and Column121 are useful only when C1 = 37
  • So CREATE INDEX i1 ON dbo.t1 (Column120, Column121) WHERE C1 = 37
  • SELECT Column121 FROM dbo.t1 WHERE Column120 = 13
    Cannot use the index even if Column120 and Column121 only appear for C1 = 37
    As far as the QO knows, there may be other Column120 or Column121 values that are not in the index
◦ Help the QO by adding more limiting predicates to the WHERE clause
  • Make it WHERE Column120 = 13 AND C1 = 37
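
The same example written out as a script. The table definition is an assumption (the slide names only dbo.t1, C1, Column120 and Column121; the Id column is added here so the table has a key). Whether the optimizer actually chooses i1 for the second query still depends on its cost estimates.

```sql
-- Assumed shape for dbo.t1; only C1, Column120 and Column121 come from the slide.
CREATE TABLE dbo.t1
(
    Id        int NOT NULL PRIMARY KEY,
    C1        int NOT NULL,
    Column120 int NULL,
    Column121 int NULL
);

CREATE NONCLUSTERED INDEX i1
    ON dbo.t1 (Column120, Column121)
    WHERE C1 = 37;

-- Not eligible for i1: nothing tells the optimizer that C1 = 37 for these rows.
SELECT Column121
FROM dbo.t1
WHERE Column120 = 13;

-- Eligible for i1: the query predicate is covered by the index filter.
SELECT Column121
FROM dbo.t1
WHERE Column120 = 13
  AND C1 = 37;
```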

Filtered Indexes – Getting Them Used 2
◦ WHERE with a variable rather than a literal
◦ Assume the index is on WHERE IsActive > 0
  • DECLARE @IsActive int; SET @IsActive = 1;
    SELECT xyz FROM table WHERE IsActive = @IsActive
◦ QO doesn't know the value of the variable, so doesn't know if the index fits
  • So shouldn't use variables as if they were constants
◦ Again, help the QO by adding more limiting predicates to the WHERE clause
  • Make it WHERE IsActive = @IsActive AND IsActive > 0
    But perhaps that doesn't really make sense here
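
A sketch of the variable case. The table dbo.Accounts, its columns, and the index name are hypothetical; only the IsActive > 0 filter and the @IsActive variable come from the slide.

```sql
-- Hypothetical table for the IsActive example.
CREATE TABLE dbo.Accounts
(
    AccountId int         NOT NULL PRIMARY KEY,
    IsActive  int         NOT NULL,
    Name      varchar(50) NOT NULL
);

CREATE NONCLUSTERED INDEX IX_Accounts_Active
    ON dbo.Accounts (IsActive)
    INCLUDE (Name)
    WHERE IsActive > 0;

DECLARE @IsActive int;
SET @IsActive = 1;

-- The variable's value is not known when the plan is built,
-- so the optimizer cannot prove the rows fall inside the filter.
SELECT Name
FROM dbo.Accounts
WHERE IsActive = @IsActive;

-- Repeating the filter as a literal predicate makes the match provable.
SELECT Name
FROM dbo.Accounts
WHERE IsActive = @IsActive
  AND IsActive > 0;
```

The extra predicate changes the result if @IsActive is ever 0 or negative, which is the slide's caveat that the rewrite has to make sense for the query.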

Filtered Indexes – Getting Them Used 3
◦ WHERE with a function or conversion on the filter predicate
  • Obvious: WHERE ABS(C1) = 37
    Cannot use an index on WHERE C1 = 37
    Could change it to WHERE C1 = ABS(37) if it had the same meaning .. but not in this case
  • Implicit conversions: assume the index is WHERE c3 > 100
    DECLARE @varR real; SET @varR = 1000.5;
    SELECT * FROM tv2 WHERE c3 = @varR
    Requires conversion of c3 to real before comparison, so can't use the index
    SELECT * FROM tv2 WHERE c3 = cast(@varR as int)
    At least it requires no conversion of c3, but it is an unknown value at optimization time, so can't use the index
  • So add a limiting predicate … assuming you know it will always be right
    SELECT * FROM tv2 WHERE c3 = cast(@varR as int) AND c3 > 100

A Mis-Application of Filtered Indexes
◦ Create a filtered index on c and b with WHERE on c
◦ Attempt to use the index as a validation table
◦ In code, use the index in a hint and expect to get no row back for a b where c is a match, but it gets an error instead because the hint prevents a plan from being created

Filtered Indexes – And Views
◦ Cannot create a filtered index on a view, not even a non-clustered index on an indexed view
  • But a filtered index can be chosen by the QO for the query formed from a view .. or function

Filtered Indexes – Considerations 1
◦ Storage size differences
  • Fewer index rows take less space
  • Less IO, more information fits in memory
  • 4,000 pages vs. 1 page
◦ Limits auto-parameterization
  • QO will not auto-parameterize if the predicate is used in a filtered index ("in most cases", per BOL)
  • Otherwise it would inhibit use of the filtered index
  • So can affect plan reuse
◦ Index maintenance – same rebuild and reorganize as a regular index
  • But hopefully much less work to do

Filtered Indexes – Considerations 2
◦ Covering index
  • Consider INCLUDEing other columns so the index is more likely to be selected by the QO
◦ DTA can suggest a filtered index
  • ColX IS NOT NULL – only of this form
  • But the missing-indexes functionality does not flag them as missing
◦ When not to use:
  • When a non-filtered index already exists, or another access path is likely better or adequate
  • Avoid the extra index maintenance
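
A minimal sketch of the covering-index suggestion. The table dbo.Tickets, its columns, and the index name are hypothetical.

```sql
-- Hypothetical table: open tickets have a NULL EndDate.
CREATE TABLE dbo.Tickets
(
    TicketId   int          NOT NULL PRIMARY KEY,
    CustomerId int          NOT NULL,
    Subject    varchar(100) NOT NULL,
    EndDate    datetime     NULL
);

-- Filtered index that also carries Subject, so the query below
-- can be answered from the index without touching the base table.
CREATE NONCLUSTERED INDEX IX_Tickets_Open
    ON dbo.Tickets (CustomerId)
    INCLUDE (Subject)
    WHERE EndDate IS NULL;

SELECT CustomerId, Subject
FROM dbo.Tickets
WHERE CustomerId = 42
  AND EndDate IS NULL;
```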

Filtered Statistics
◦ CREATE STATISTICS stats1 ON table (cols) WHERE <condition>
◦ Uses:
  • Can create filtered statistics on skewed data to assist the QO
    Filtered statistics will likely be more precise because they cover only the data in the filtered subset (or filtered index)
  • Table partitions, where statistics are needed only on the 'current' partition(s)
◦ Cannot reference a computed column, a UDT column, a spatial data type column, or a hierarchyid data type column
◦ AutoCreateStats will create statistics on filtered index key columns
◦ AutoCreateStats will not create filtered statistics on other columns
  • You have to create them yourself
◦ AutoUpdateStats will keep them updated once they are created
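
Creating filtered statistics by hand might look like the following. The table dbo.Sales, the Region values, and the statistics name are hypothetical; CREATE STATISTICS with a WHERE clause and DBCC SHOW_STATISTICS are standard T-SQL.

```sql
-- Hypothetical table whose Amount distribution differs sharply by Region.
CREATE TABLE dbo.Sales
(
    SaleId int         NOT NULL PRIMARY KEY,
    Region varchar(20) NOT NULL,
    Amount money       NOT NULL
);

-- Statistics that describe Amount only for the 'West' subset.
CREATE STATISTICS stats_Sales_Amount_West
    ON dbo.Sales (Amount)
    WHERE Region = 'West';

-- Inspect the statistics; the header includes the filter expression.
DBCC SHOW_STATISTICS ('dbo.Sales', stats_Sales_Amount_West);
```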

Metadata for Indexes, Statistics
◦ sys.indexes
  • has_filter, filter_definition
◦ sys.stats
  • has_filter, filter_definition
◦ SSMS
  • Indexes and Statistics Properties have a Filter tab
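
The two catalog views can be queried directly to list every filtered index and filtered statistics object in the current database. The column aliases are illustrative only.

```sql
-- Filtered indexes in the current database.
SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name                   AS index_name,
       i.filter_definition
FROM sys.indexes AS i
WHERE i.has_filter = 1;

-- Filtered statistics in the current database.
SELECT OBJECT_NAME(s.object_id) AS table_name,
       s.name                   AS stats_name,
       s.filter_definition
FROM sys.stats AS s
WHERE s.has_filter = 1;
```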

Questions on Filtered Indexes, Statistics
◦ Any questions?
◦ Now we'll move on to Wide Tables, Sparse Columns

Wide Tables
◦ Up to 30,000 columns
  • Great for SharePoint-like "a row is an object, some attributes depend on other attributes"
◦ Some limits:
  • Columns per non-wide table: 1,024
  • Columns per wide table: 30,000
  • Columns per SELECT statement: 4,096
  • Columns per INSERT statement: 4,096
  • Indexes per table: 1,000
  • Statistics per table: 30,000
  • BOL: Maximum Capacity Specifications for SQL Server

Wide Table
◦ A wide table has defined a column set, using sparse columns
  • New row structure for sparse columns: {column, value}, {column, value} …
  • Can create flexible schemas within an application
  • Can add or drop columns whenever you want without having to touch each row
◦ The maximum size of a wide table row is 8,018 bytes, so most of the data in a row has to be NULL
  • Or has to be varchar-type columns so it can overflow to another page
◦ Limit is still 1,024 for the number of non-sparse columns plus computed columns, even in a wide table

Wide Tables – Performance Impact
◦ Performance considerations:
  • Increased run-time and compile-time memory requirements
  • Wide tables can have up to 30,000 columns defined; this can increase compile time
  • There can be up to 1,000 indexes on a wide table, which increases the index maintenance time
  • Nonclustered indexes should be filtered indexes to minimize their impact
For more information, see BOL: Performance Considerations for Wide Tables

Sparse Columns
◦ CREATE TABLE … (…, c1 int SPARSE NULL, …)
◦ New row format for sparse columns
◦ Column:
  • Must be NULLable
  • Cannot be part of a clustered index
  • Cannot be part of a primary key index
  • Cannot have a DEFAULT
  • Cannot be a computed column
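
A small sketch of declaring sparse columns, both at table creation and afterwards. The table dbo.Catalog and its columns are hypothetical.

```sql
-- Hypothetical table: BikeColor is NULL for almost every row, so it is declared SPARSE.
CREATE TABLE dbo.Catalog
(
    ItemId    int         NOT NULL PRIMARY KEY,  -- key columns cannot be sparse
    ItemType  varchar(20) NOT NULL,
    BikeColor varchar(20) SPARSE NULL            -- sparse columns must allow NULL
);

-- A sparse column can also be added later.
ALTER TABLE dbo.Catalog
    ADD FrameSize int SPARSE NULL;
```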

Sparse Columns – Some More Cannots
◦ Some types cannot be sparse:
  • geography
  • geometry
  • image
  • ntext
  • text
  • timestamp
  • User-defined data types
◦ Some attributes cannot be on sparse columns
  • No Filestream
  • Not Identity
  • Not RowGuidCol

Sparse Columns – Types and Size
◦ Size impact
  • An important consideration but not the only one
◦ At what percentage of NULLs does a sparse column take less space than a non-sparse column?
    Type      Non-Sparse    Sparse           Null Estimate
    BIT       1/8th byte    4 1/8th bytes    98%
    BIGINT    8 bytes       12 bytes         52%
  • The Null Estimate follows BOL, which lists the NULL percentage needed for a net space saving of 40 percent
See BOL: Using Sparse Columns for a complete table of types

Column Sets
◦ How do you know which columns 'exist' for a row?
◦ You could just SELECT them; those that don't exist are NULL
◦ Can define a "column set"
  • Optional, only one per table
◦ Include a column: MyColSet XML COLUMN_SET FOR ALL_SPARSE_COLUMNS
◦ Selecting from MyColSet returns an XML description of the sparse columns in that row
  • <c25>ABC</c25><c34>599</c34>
◦ Can INSERT / UPDATE sparse columns by
  • Referring to them by name as usual, or
  • Specifying the XML for the Column_Set column
See BOL: Using Column Sets for more details
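
A hedged sketch of a column set in use. The table dbo.Products and the ProductId column are hypothetical; c25, c34 and MyColSet are the names from the slide.

```sql
-- Hypothetical table with two sparse columns and a column set over them.
CREATE TABLE dbo.Products
(
    ProductId int         NOT NULL PRIMARY KEY,
    c25       varchar(20) SPARSE NULL,
    c34       int         SPARSE NULL,
    MyColSet  xml COLUMN_SET FOR ALL_SPARSE_COLUMNS
);

-- Write the sparse columns by name ...
INSERT INTO dbo.Products (ProductId, c25, c34)
VALUES (1, 'ABC', 599);

-- ... or by supplying XML for the column set.
INSERT INTO dbo.Products (ProductId, MyColSet)
VALUES (2, '<c25>DEF</c25>');

-- Returns one XML fragment per row listing only the sparse columns that have values.
SELECT ProductId, MyColSet
FROM dbo.Products;

-- The sparse columns can still be selected individually by name.
SELECT ProductId, c25, c34
FROM dbo.Products;
```

Note that once a column set is defined, SELECT * returns the column set in place of the individual sparse columns, so queries that need specific sparse columns should name them.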

Feature / Technology Support
◦ Sparse columns and column sets are not fully supported by some SQL Server technologies
◦ Sparse columns not supported by:
  • Merge Replication
◦ Column sets not supported by:
  • Replication, Distributed Query, Change Data Capture
See BOL: Using Column Sets for more details

Meta Data for Sparse Columns
◦ sys.columns – is_sparse, is_column_set
  • And in: sys.system_columns, sys.all_columns, sys.computed_columns, sys.identity_columns
◦ Do not confuse with sparse files as used for Database Snapshots
  • The is_sparse in sys.database_files, sys.master_files

Together
◦ Sparse Columns together with Filtered Index
◦ On a sparse column, a filtered index with xx IS NOT NULL avoids indexing all the rows with no value
◦ Makes a lot of sense, and likely the driving force behind filtered indexes
◦ But not needed on every sparse column
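
Putting the two features together might look like this. The table dbo.Inventory, its columns, and the index name are hypothetical; the IS NOT NULL filter is the form named on the slide.

```sql
-- Hypothetical table: BikeColor has a value for only a small fraction of rows.
CREATE TABLE dbo.Inventory
(
    ItemId    int         NOT NULL PRIMARY KEY,
    ItemType  varchar(20) NOT NULL,
    BikeColor varchar(20) SPARSE NULL
);

-- Index only the rows that actually have a BikeColor.
CREATE NONCLUSTERED INDEX IX_Inventory_BikeColor
    ON dbo.Inventory (BikeColor)
    WHERE BikeColor IS NOT NULL;

-- An equality test can only match non-NULL values, so this query
-- is a candidate for the filtered index.
SELECT ItemId, BikeColor
FROM dbo.Inventory
WHERE BikeColor = 'Red';
```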

Separately
◦ Filtered Index without Sparse Column
  • Filtered indexes on skewed data
  • Filtered statistics on skewed data
◦ Sparse Column without Filtered Index
  • Sparse columns on sparse data, perhaps no index to go with it

Summary
◦ Filtered Indexes
◦ Filtered Statistics
◦ Wide Tables
◦ Sparse Columns
◦ Together …
◦ … and Separately
Don Vilen, Chief Scientist, Buysight, DVilen@buysight.com
To learn more or inquire about speaking opportunities, please contact:
Mark Ginnebaugh, User Group Leader, [email protected]