Module 7
Designing Queries for Optimal Performance
Module Overview
• Considerations for Optimizing Queries for Performance
• Refactoring Cursors into Queries
• Extending Set-Based Operations
Lesson 1: Considerations for Optimizing Queries for Performance
• Overview of Query Logical Flow
• Using the Query Optimizer to Process Queries
• Guidelines for Building Efficient Queries
• Considerations for Creating User-Defined Functions
• Considerations for Using User-Defined Functions
• Considerations for Determining Temporary Storage
• Discussion: Optimizing a Query
Overview of Query Logical Flow
[Diagram: logical flow of a query, shown for two cases]
• Non-Aggregate Query: From and Join, Rows, Where, Select, Order By, Result Set
• Aggregate Query: From and Join, Rows, Where, Grouping and Aggregation, Having, Select, Order By, Result Set
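The logical flow of an aggregate query can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the orders table and its data are hypothetical. The comments number the clauses in the order they are logically processed, not the order in which they are written.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("Ann", 50), ("Ann", 200), ("Bob", 300), ("Bob", 20), ("Cy", 500)],
)

rows = conn.execute(
    """
    SELECT customer, SUM(amount) AS total   -- 5. Select
    FROM orders                             -- 1. From and Join
    WHERE amount >= 100                     -- 2. Where (filters rows)
    GROUP BY customer                       -- 3. Grouping and Aggregation
    HAVING SUM(amount) >= 250               -- 4. Having (filters groups)
    ORDER BY total DESC                     -- 6. Order By
    """
).fetchall()

print(rows)
```

Because Where runs before grouping, the two small orders are discarded before any totals are computed; Having then removes whole groups whose total is too low.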
Using the Query Optimizer to Process Queries
[Diagram: the Query Optimizer takes the query and the database schema as inputs and produces a query plan]
Guidelines for Building Efficient Queries
• Test query variations for performance
• Avoid query hints
• Use correlated subqueries to improve performance
• Use table-valued, user-defined functions as derived tables
• Avoid unnecessary GROUP BY columns; use a subquery instead
• Use CASE expressions to include variable logic in a query
• Divide joins into temporary tables when you query large tables
• Favor set-based logic over procedural or cursor logic
• Avoid using a scalar user-defined function in the WHERE clause
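As one illustration of these guidelines, the sketch below uses a CASE expression to place variable logic inside a single query instead of branching in procedural code. It runs against SQLite through Python's sqlite3 module as a stand-in for SQL Server; the employees table, the sick_hours column, and the band thresholds are all hypothetical.

```python
import sqlite3

# Hypothetical table and thresholds, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, sick_hours INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ann", 4), ("Bob", 40), ("Cy", 90)],
)

# One set-based query classifies every row; no per-row IF/ELSE is needed.
bands = conn.execute(
    """
    SELECT name,
           CASE
               WHEN sick_hours < 8  THEN 'low'
               WHEN sick_hours < 80 THEN 'medium'
               ELSE 'high'
           END AS usage_band
    FROM employees
    ORDER BY name
    """
).fetchall()

print(bands)
```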
Considerations for Creating User-Defined Functions
• Consider relevant factors when indexing the results of the function
• Troubleshoot and test the function
• Create each function to accomplish a single task
• Qualify object names referenced by a function with the appropriate schema name
• Identify the type of function to be used
[Diagram: a user-defined function referenced from the SELECT, FROM, and WHERE clauses of a query]
Considerations for Using User-Defined Functions
• Integrate the user-defined function into the query plan as a join
• Consider the balance between performance and maintainability
• Avoid using a user-defined function if performance suffers tremendously
Considerations for Determining Temporary Storage
To achieve optimal tempdb performance:
• Set the recovery model of tempdb to SIMPLE
• Allow for tempdb files to automatically grow
• Set the file growth increment to a reasonable size
• Preallocate space for all tempdb files
• Create multiple files to maximize disk bandwidth
• Make each data file of the same size
• Load the tempdb database on a fast I/O subsystem
• Consider transferring the tempdb database to a different subsystem or disk
Discussion: Optimizing a Query
• What is the primary consideration when handling repetitive tasks against a set of data?
• What will be the effect of having the tempdb database on the same disk or Logical Unit Number (LUN) as the transaction log file?
• Can disciplined code formatting and using naming standards improve query execution performance? Explain the benefits of disciplined code formatting and using naming standards.
Lesson 2: Refactoring Cursors into Queries
• Building a T-SQL Cursor
• Common Scenarios for Cursor-Based Operations
• Demonstration: How To Refactor a Cursor
• Discussion: Using Cursors
• Guidelines for Using Result Set-Based Operations
• Selecting Appropriate Server-Side Cursors
• Selecting Appropriate Client-Side Cursors
Building a T-SQL Cursor
1. Declare the variables for the data to be returned by the cursor
2. Use the DECLARE CURSOR statement to define the SELECT statement
3. Use the OPEN statement to execute the SELECT statement
4. Use the FETCH NEXT INTO statement to retrieve values from the next row
5. Issue the CLOSE and DEALLOCATE statements to close the cursor
Why Cursors Are Slow
• Each FETCH in a cursor has the same performance as a SELECT statement
• Cursors use large amounts of memory
• Cursors can cause locking problems in the database
• Cursors consume network bandwidth
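The cost of row-by-row processing can be sketched by comparing a cursor-style loop with a single set-based statement. This is only an approximation of a Transact-SQL cursor: it uses Python's sqlite3 module as a stand-in for SQL Server, and the leave_balance table and the 8-hour accrual are hypothetical.

```python
import sqlite3


def make_db():
    """Create a fresh in-memory database with hypothetical leave balances."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE leave_balance (emp_id INTEGER PRIMARY KEY, hours INTEGER)"
    )
    conn.executemany(
        "INSERT INTO leave_balance VALUES (?, ?)", [(1, 10), (2, 0), (3, 25)]
    )
    return conn


def accrue_row_by_row(conn):
    """Cursor-style: fetch each row, then issue one UPDATE per row."""
    rows = conn.execute("SELECT emp_id, hours FROM leave_balance").fetchall()
    for emp_id, hours in rows:
        conn.execute(
            "UPDATE leave_balance SET hours = ? WHERE emp_id = ?",
            (hours + 8, emp_id),
        )


def accrue_set_based(conn):
    """Set-based: one statement updates every row."""
    conn.execute("UPDATE leave_balance SET hours = hours + 8")


def balances(conn):
    return conn.execute(
        "SELECT emp_id, hours FROM leave_balance ORDER BY emp_id"
    ).fetchall()


cursor_db = make_db()
accrue_row_by_row(cursor_db)

set_db = make_db()
accrue_set_based(set_db)

# Both approaches leave the table in the same state.
print(balances(cursor_db) == balances(set_db))
```

The loop issues one statement per row, so its cost grows with the row count; the set-based version issues a single statement regardless of table size.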
Common Scenarios for Cursor-Based Operations
Problem | Description | Solution | Cursor Usage
Complex Logic | Difficult to translate into a set-based solution | Refactor the logic as a data-driven query | Rare
Dynamic Code Iteration | Requires DDL code | Use Transact-SQL cursors | Always
List Denormalization | Converts a vertical list of values to a single comma-delimited horizontal list or string | Use set-based operations, recursion, or XML queries | Sometimes
Crosstab Query Building | Difficult to build by using SQL Server | Use a series of CASE expressions or PIVOT syntax | Never*
Cumulative Totals | Needs to be calculated within SQL Server and written to a table | Use Transact-SQL cursors | Sometimes
Hierarchical Tree Navigation | Needs recursive examination of each node | Use set-based methods that use stored procedures or UDFs | Never
*Constructing a dynamic cross-tab query requires using a cursor to build the columns for the dynamic SQL
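The list denormalization scenario can be sketched without a cursor. The example below is an assumption-laden stand-in: it uses SQLite's group_concat aggregate through Python's sqlite3 module, which is not the SQL Server technique (the table above points to set-based operations, recursion, or XML queries there). The recipients table and its addresses are hypothetical.

```python
import sqlite3

# Hypothetical recipients; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE recipients (dept TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO recipients VALUES (?, ?)",
    [
        ("HR", "ann@example.com"),
        ("HR", "bob@example.com"),
        ("IT", "cy@example.com"),
    ],
)

# One aggregate query turns the vertical list into one
# comma-delimited string per department.
lists = dict(
    conn.execute(
        """
        SELECT dept, group_concat(email, ',')
        FROM recipients
        GROUP BY dept
        """
    ).fetchall()
)

# The order of items inside each string is not guaranteed by SQLite.
for dept, emails in sorted(lists.items()):
    print(dept, emails)
```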
Demonstration: How To Refactor a Cursor
In this demonstration, you will see how to:
Refactor a cursor
Discussion: Using Cursors
• List some of the disadvantages of using a cursor.
• What is the major issue with using a cursor in modern relational databases?
• What kind of a problem is best solved by using a cursor?
• Discuss your own experiences with cursors.
Guidelines for Using Result Set-Based Operations
• Use queries that affect groups of rows rather than one row at a time
• Avoid making inline calls to scalar UDF in large result sets
• Limit query cardinality as early as possible
• Use result sets instead of cursor-based processes to minimize I/O
• Minimize the use of conditional branches inside queries
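The guideline about scalar UDF calls can be made concrete by counting how often a function is invoked. This sketch registers a Python function as a scalar function in SQLite (via sqlite3's create_function) as a rough analogue of a scalar UDF in SQL Server; the function name is_long_leave, the leave table, and the 40-hour threshold are hypothetical.

```python
import sqlite3

calls = {"n": 0}


def is_long_leave(hours):
    """Hypothetical scalar function; counts how many times it is invoked."""
    calls["n"] += 1
    return int(hours > 40)


conn = sqlite3.connect(":memory:")
conn.create_function("is_long_leave", 1, is_long_leave)
conn.execute("CREATE TABLE leave (name TEXT, hours INTEGER)")
conn.executemany(
    "INSERT INTO leave VALUES (?, ?)",
    [("Ann", 8), ("Bob", 48), ("Cy", 16), ("Di", 80), ("Ed", 40)],
)

# Scalar function in the WHERE clause: evaluated for every row scanned.
with_udf = conn.execute(
    "SELECT name FROM leave WHERE is_long_leave(hours) = 1 ORDER BY name"
).fetchall()
udf_calls = calls["n"]

# Same filter written inline: no function calls at all.
inline = conn.execute(
    "SELECT name FROM leave WHERE hours > 40 ORDER BY name"
).fetchall()

print(with_udf == inline, udf_calls)
```

Both queries return the same rows, but the first pays one function call per row examined, which is the cost the guideline warns about for large result sets.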
Selecting Appropriate Server-Side Cursors
Server-Side Cursors
• Static Cursor
• Forward-Only Cursor
• Keyset-Driven Cursor
• Dynamic Cursor
Selecting Appropriate Client-Side Cursors
Considerations for Using Client-Side Cursors
• Network latency. Client cursors use more network resources
• Additional cursor types. Client cursors support only limited functionality
• Positioned updates. Client-side cursors will not reflect database changes until the changes are synchronized with the database
• Memory usage. The client computer should have enough memory to handle the size of the entire result set
Client Data Access Libraries That Support Client-Side Cursors
• ODBC
• ADO
• ADO.NET-SqlClient
• OLE DB
Lesson 3: Extending Set-Based Operations
• What Are Common Table Expressions?
• Comparing CTE with Other SQL Tuning Techniques
• Demonstration: How To Use a CTE
• Discussion: Using Common Table Expressions
• Demonstration: How To Perform Recursive Queries with CTE
• Discussion: Recursion with CTEs
• Introduction to Ranking Functions
• Demonstration: How To Use Ranking Functions To Rank Rows
• What Are PIVOT and UNPIVOT Operators?
• Demonstration: How To Use PIVOT and UNPIVOT Options To Convert Data
What Are Common Table Expressions?
A CTE is a named temporary result set based on a regular SELECT query. The following table describes the syntax parameters for a CTE.
Parameter | Description
expression_name | Is the name used to reference the CTE in the query that uses it; can be any valid identifier
column_name | Specifies the name of a column for the CTE; is taken from the result set if no column_name parameters are specified
CTE_query_definition | Specifies the SELECT statement that forms the result set; is followed by a SELECT, INSERT, UPDATE, or DELETE query
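The three syntax parameters (expression_name, column_name, and CTE_query_definition) can be seen together in a small sketch. It runs on SQLite through Python's sqlite3 module as a stand-in for SQL Server; the leave_taken table and the name leave_totals are hypothetical.

```python
import sqlite3

# Hypothetical leave records; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE leave_taken (emp TEXT, kind TEXT, hours INTEGER)")
conn.executemany(
    "INSERT INTO leave_taken VALUES (?, ?, ?)",
    [
        ("Ann", "vacation", 16),
        ("Ann", "sick", 8),
        ("Bob", "vacation", 40),
        ("Cy", "sick", 4),
    ],
)

heavy_users = conn.execute(
    """
    WITH leave_totals (emp, total_hours) AS (   -- expression_name (column_name list)
        SELECT emp, SUM(hours)                  -- CTE_query_definition
        FROM leave_taken
        GROUP BY emp
    )
    SELECT emp, total_hours                     -- the statement that uses the CTE
    FROM leave_totals
    WHERE total_hours > 20
    ORDER BY emp
    """
).fetchall()

print(heavy_users)
```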
Comparing CTE with Other SQL Tuning Techniques
CTE vs Temporary Table
• A CTE does not store data anywhere until you actually execute it, whereas the data in a temporary table is stored in the tempdb database
• You must use a CTE in the statement immediately after you define it, whereas you can query a temporary table over and over again
• COMPUTE, ORDER BY (without TOP), INTO, OPTION, FOR XML, and FOR BROWSE are not allowed in a CTE, whereas these options are supported with a temporary table
CTE vs Subquery
• In a CTE, the result set is evaluated just once when a query is executed, whereas in a subquery the result set is evaluated every time a query is executed
Demonstration: How To Use a CTE
In this demonstration, you will see how to:
Create and use a CTE
Discussion: Using Common Table Expressions
• How does a CTE differ from a #Temp table?
• Can you execute two or more queries against a CTE?
• How does a CTE differ from a derived table?
• Can you build indexes or constraints on a CTE?
Demonstration: How To Perform Recursive Queries with CTEs
In this demonstration, you will see how to:
Perform recursive queries with CTEs
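A recursive CTE of the kind this demonstration covers can be sketched as follows. The example walks a manager hierarchy using SQLite through Python's sqlite3 module as a stand-in for SQL Server; the staff table and its rows are hypothetical. SQLite spells the construct WITH RECURSIVE, whereas Transact-SQL uses plain WITH.

```python
import sqlite3

# Hypothetical reporting hierarchy: Dana manages Eli and Fay; Eli manages Gus.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff (emp_id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER)"
)
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [(1, "Dana", None), (2, "Eli", 1), (3, "Fay", 1), (4, "Gus", 2)],
)

tree = conn.execute(
    """
    WITH RECURSIVE reports (emp_id, name, depth) AS (
        -- Anchor member: the top of the hierarchy.
        SELECT emp_id, name, 0
        FROM staff
        WHERE manager_id IS NULL
        UNION ALL
        -- Recursive member: direct reports of rows found so far.
        SELECT s.emp_id, s.name, r.depth + 1
        FROM staff AS s
        JOIN reports AS r ON s.manager_id = r.emp_id
    )
    SELECT name, depth
    FROM reports
    ORDER BY depth, name
    """
).fetchall()

print(tree)
```

The anchor member runs once; the recursive member then repeats against the rows produced by the previous pass until a pass returns no rows.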
Discussion: Recursion with CTEs
• What is the maximum number of recursive levels in a common table expression (CTE)?
• What is the default number of recursions in a recursive common table expression?
• Assuming that each recursion adds only one row to the results, how many rows will be returned with OPTION MAXRECURSION(100)? Select an option from the following:
• 99
• 100
• 101
Introduction to Ranking Functions
Ranking functions return a ranking value for each row in a partition
Function | Description
RANK | Returns the rank of each row within the partition of a result set
NTILE | Distributes the rows in an ordered partition into a specified number of groups
DENSE_RANK | Returns the rank of rows within the partition of a result set, without any gaps in the ranking
ROW_NUMBER | Returns the sequential number of a row within a partition of a result set, starting at 1 for the first row in each partition
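The differences between the four ranking functions are easiest to see on data that contains a tie. This sketch runs on SQLite through Python's sqlite3 module as a stand-in for SQL Server and assumes a SQLite build with window-function support (3.25 or later); the scores table is hypothetical.

```python
import sqlite3

# Hypothetical scores with a tie between Ann and Bob.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("Ann", 90), ("Bob", 90), ("Cy", 80), ("Di", 70)],
)

ranked = conn.execute(
    """
    SELECT name,
           RANK()       OVER (ORDER BY points DESC)       AS rnk,
           DENSE_RANK() OVER (ORDER BY points DESC)       AS dense_rnk,
           ROW_NUMBER() OVER (ORDER BY points DESC, name) AS row_num,
           NTILE(2)     OVER (ORDER BY points DESC, name) AS half
    FROM scores
    ORDER BY points DESC, name
    """
).fetchall()

for row in ranked:
    print(row)
```

After the tie, RANK skips a value (1, 1, 3, 4) while DENSE_RANK does not (1, 1, 2, 3); ROW_NUMBER numbers every row uniquely, and NTILE(2) splits the four rows into two groups of two.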
Demonstration: How To Use Ranking Functions To Rank Rows
In this demonstration, you will see how to:
Use Ranking Functions to rank rows
What Are PIVOT and UNPIVOT Operators?
PIVOT is used to generate crosstab queries in which values are converted to column headers. UNPIVOT is used to convert column headers to values. The following table describes the parameters in the PIVOT and UNPIVOT syntax.
Parameter | Description
table_source | Is the name of the table that you need to pivot
aggregate_function | Is a system or user-defined aggregate function that applies to the specified value_column
pivot_column | Is the source column that provides the values for the new crosstab column
column_list | Is a list of values of pivot_column to display as the crosstab column headers
table_alias | Is the name of the resulting result set
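SQLite has no PIVOT operator, so the sketch below builds a crosstab of similar shape with a series of CASE expressions, the alternative this module names for crosstab queries. It is not the PIVOT syntax itself. It uses Python's sqlite3 module as a stand-in for SQL Server; the leave_taken table is hypothetical. In PIVOT terms, kind plays the role of pivot_column, the values 'vacation' and 'sick' form the column_list, SUM is the aggregate_function, and hours is the value column.

```python
import sqlite3

# Hypothetical leave records; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE leave_taken (emp TEXT, kind TEXT, hours INTEGER)")
conn.executemany(
    "INSERT INTO leave_taken VALUES (?, ?, ?)",
    [
        ("Ann", "vacation", 16),
        ("Ann", "sick", 8),
        ("Ann", "vacation", 8),
        ("Bob", "vacation", 40),
        ("Cy", "sick", 4),
    ],
)

# Each CASE expression turns one value of `kind` into its own column.
crosstab = conn.execute(
    """
    SELECT emp,
           SUM(CASE WHEN kind = 'vacation' THEN hours ELSE 0 END) AS vacation,
           SUM(CASE WHEN kind = 'sick'     THEN hours ELSE 0 END) AS sick
    FROM leave_taken
    GROUP BY emp
    ORDER BY emp
    """
).fetchall()

print(crosstab)
```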
Demonstration: How To Use PIVOT and UNPIVOT Options To Convert Data
In this demonstration, you will see how to:
Use PIVOT and UNPIVOT options to convert data
Lab 7: Designing Queries for Optimal Performance
• Exercise 1: Optimizing Query Performance
• Exercise 2: Refactoring Cursors into Queries
Estimated time: 60 minutes
Logon Information
Virtual machine | NYC-SQL1
User name | Administrator
Password | Pa$$w0rd
Lab Scenario
You are a lead database designer at QuantamCorp. You are working on the Human Resources Vacation and Sick Leave Enhancement (HR VASE) project that is designed to enhance the current HR system of your organization. This system is based on the QuantamCorp sample database in SQL Server 2008.
The main goals of the HR VASE project are as follows:
• Provide managers with current and historical information about employee vacation and sick leave.
• Grant view rights to individual employees to view their vacation and sick leave balances.
• Provide permission to selected employees in the HR department to view and update the vacation and sick leave details of employees.
• Grant the HR manager view and update rights to all the data.
You are working on a project to integrate HR VASE with an intranet site that is used to send email broadcasts to external people. The details of email recipients are loaded from QuantamCorp HR VASE into the system named Baldwin2.
Recently, a number of functions in Baldwin2 have received many complaints about their performance. You have been assigned to help fine-tune the performance of the SQL used by those functions.
Module Review and Takeaways
• Review Questions
• Real-World Issues and Scenarios