UPDATING YOUR SQL SERVER 2005 SKILLS TO SQL SERVER 2008

Microsoft TechNet Academy

SQL Server 2008 Data Warehousing Enhancements

• ETL Enhancements

• Using Partitioned Tables

• Optimizing Storage

ETL Enhancements

• What Is Change Data Capture?

• How to Use Change Data Capture

• What Is the MERGE Statement?

• How to Use the MERGE Statement

• Integration Services Enhancements

• Data Profiling with Integration Services

What Is Change Data Capture?

• Improves incremental loads

• Captures insert, update, and delete commands

• Enabled at the database and table levels

• Stores before and after images for updates

• Stores changes in a relational format

• Supports configuration to enable queries to access all changes or only net changes

• Includes tables and functions created to support storing and retrieving changed data

• Depends on the SQL Server Agent service and its jobs to capture changes and clean up old change data

How to Use Change Data Capture

• sys.sp_cdc_enable_db—enable change data capture in a database

• sys.sp_cdc_enable_table—enable change data capture on a table

• cdc.fn_cdc_get_all_changes_<capture_instance>—query all changes made to the CDC-enabled table

• cdc.fn_cdc_get_net_changes_<capture_instance>—query for net changes made to the CDC-enabled table
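The four objects above might be combined as in the following sketch. It assumes a hypothetical table dbo.Orders in a scratch database, sysadmin rights, a running SQL Server Agent service, and the default capture instance name dbo_Orders (schema name, underscore, table name). None of these names come from the course material.

```sql
-- Hypothetical demo table; net-change queries need a primary key or unique index.
CREATE TABLE dbo.Orders (
    OrderID int         NOT NULL PRIMARY KEY,
    Status  varchar(20) NOT NULL
);
GO

-- 1. Enable change data capture in the current database (requires sysadmin).
EXEC sys.sp_cdc_enable_db;
GO

-- 2. Enable capture on the table; the capture instance defaults to dbo_Orders.
EXEC sys.sp_cdc_enable_table
    @source_schema        = N'dbo',
    @source_name          = N'Orders',
    @role_name            = NULL,  -- no gating role (an assumption for this demo)
    @supports_net_changes = 1;
GO

-- 3. Make some changes for the capture job to pick up.
INSERT INTO dbo.Orders (OrderID, Status) VALUES (1, 'New');
UPDATE dbo.Orders SET Status = 'Shipped' WHERE OrderID = 1;
GO

-- 4. Query the captured changes. Capture is asynchronous, so run this step
--    only after the SQL Server Agent capture job has processed the log.
DECLARE @from_lsn binary(10), @to_lsn binary(10);
SET @from_lsn = sys.fn_cdc_get_min_lsn(N'dbo_Orders');
SET @to_lsn   = sys.fn_cdc_get_max_lsn();

-- Every captured change in the LSN range.
SELECT * FROM cdc.fn_cdc_get_all_changes_dbo_Orders(@from_lsn, @to_lsn, N'all');

-- One row per changed key, reflecting its final state in the LSN range.
SELECT * FROM cdc.fn_cdc_get_net_changes_dbo_Orders(@from_lsn, @to_lsn, N'all');
```

Passing N'all update old' instead of N'all' to the all-changes function should also return the before image of each update.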

What Is the MERGE Statement?

• Transact-SQL command

• Joins a data source with a target table or view

• Performs multiple actions based on the results of the join

[Diagram: tables Option_Transactions and Emp_Option_Total; example statement: INSERT Option_Transactions VALUES (1, 3, '1/1/2008')]

How to Use the MERGE Statement

-- Defines the target table
MERGE Production.ProductInventory AS pi
-- Defines the data source
USING (SELECT ProductID, SUM(OrderQty)
       FROM Sales.SalesOrderDetail sod
       JOIN Sales.SalesOrderHeader soh
         ON sod.SalesOrderID = soh.SalesOrderID
        -- Compare dates only; GETDATE() also returns the time of day
        AND CAST(soh.OrderDate AS date) = CAST(GETDATE() AS date)
       GROUP BY ProductID) AS src (ProductID, OrderQty)
-- Defines the value used to match rows
ON (pi.ProductID = src.ProductID)
-- Defines the action to take based on the results from the ON clause
WHEN MATCHED AND pi.Quantity - src.OrderQty <> 0
    THEN UPDATE SET pi.Quantity = pi.Quantity - src.OrderQty
WHEN MATCHED AND pi.Quantity - src.OrderQty = 0
    THEN DELETE; -- Update qty or delete product when qty = 0
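To try the same pattern without the AdventureWorks sample database, the following sketch uses two table variables. The names @Inventory and @Sold and the sample rows are hypothetical. The OUTPUT clause with $action is an addition for illustration and is not part of the slide's statement.

```sql
-- Hypothetical stand-ins for the inventory (target) and today's sales (source).
DECLARE @Inventory TABLE (ProductID int PRIMARY KEY, Quantity int NOT NULL);
DECLARE @Sold      TABLE (ProductID int PRIMARY KEY, OrderQty int NOT NULL);

INSERT INTO @Inventory (ProductID, Quantity) VALUES (1, 10), (2, 5);
INSERT INTO @Sold      (ProductID, OrderQty) VALUES (1, 4),  (2, 5);

MERGE @Inventory AS inv                       -- target
USING @Sold AS src                            -- source
ON inv.ProductID = src.ProductID              -- match condition
WHEN MATCHED AND inv.Quantity - src.OrderQty = 0
    THEN DELETE                               -- sold out: remove the row
WHEN MATCHED
    THEN UPDATE SET inv.Quantity = inv.Quantity - src.OrderQty
OUTPUT $action            AS MergeAction,     -- 'UPDATE' or 'DELETE' per row
       deleted.ProductID  AS ProductID,
       deleted.Quantity   AS OldQuantity,
       inserted.Quantity  AS NewQuantity;     -- NULL for deleted rows

-- Remaining inventory after the merge.
SELECT ProductID, Quantity FROM @Inventory;
```

With these sample rows, product 1 should be updated from 10 to 6 and product 2 should be deleted, because its quantity reaches 0.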

Integration Services Enhancements

Enhanced performance and caching for the Lookup transformation

Improved Import and Export Wizard

New Date and Time Data Types

Data Profiling with Integration Services

• Data Profiling task:

Used to profile data stored in SQL Server

Can identify potential inconsistencies and problems with data quality

Can identify problems within an individual column and with relationships between columns

• Data Profile Viewer:

Is a stand-alone application used to display profile output

Provides both summary and detailed data with drilldown capabilities

Using Partitioned Tables

• Enhancements to Partitioning in SQL Server 2008

• How Partitioned Queries Are Processed

• How to Switch Partitions

Enhancements to Partitioning in SQL Server 2008

• Improved query processing on partitioned tables

• Partition-aware seek operations

• Enhanced partitioning information in both compile-time and run-time execution plans

• More efficient data transfer between tables provided by partition switching:

Assign an existing table as a new partition in a partitioned table

Switch a partition from one partitioned table to another

Move a partition from a partitioned table to form a single table

How Partitioned Queries Are Processed

• Parallel query plan strategies:

Single-thread-per-partition strategy—default behavior

Multiple-thread-per-partition strategy—enabled by using trace flag 2440

• Partition-aware seek operation:

PartitionID—a hidden computed column used internally to represent the ID of the partition that holds a particular row

Skip seek operation—seeks rows based on PartitionID, then performs a secondary seek on a different condition, only on the first-level seek results

[Diagram: multiple-thread-per-partition strategy across Partition 1, Partition 2, Partition 3, and Partition 4]
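The hidden PartitionID column is internal, but the mapping from a key value to a partition number can be inspected with the $PARTITION function. The sketch below assumes a hypothetical partition function named pfDemo; it illustrates the mapping only, not the skip seek operation itself.

```sql
-- Hypothetical partition function with boundaries at 100 and 200 (RANGE LEFT):
-- partition 1 holds values <= 100, partition 2 holds 101 to 200,
-- partition 3 holds values > 200.
CREATE PARTITION FUNCTION pfDemo (int)
AS RANGE LEFT FOR VALUES (100, 200);
GO

-- Ask which partition each sample key value would land in.
SELECT v.KeyValue,
       $PARTITION.pfDemo(v.KeyValue) AS PartitionNumber
FROM (VALUES (50), (100), (150), (250)) AS v (KeyValue);
GO

-- Clean up the demo object.
DROP PARTITION FUNCTION pfDemo;
```

The sample values 50, 100, 150, and 250 map to partitions 1, 1, 2, and 3 respectively.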

How to Switch Partitions

• Verify the following:

The source table and the target table both exist before performing the SWITCH operation

An empty partition exists to receive the data if the target table is partitioned

The entire table is empty if the target table is not partitioned

Both tables are partitioned on the same column if switching between partitioned tables

Both the source and target tables share the same filegroup

• Use the ALTER TABLE statement with the SWITCH clause

ALTER TABLE FactInternetSalesPartitioned
SWITCH PARTITION 1 TO
FactInternetSalesArchive PARTITION 1;
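One of the checks listed above, that the receiving partition is empty, could be scripted before the switch as in this sketch. It assumes that both tables live in the dbo schema, are partitioned on the same column, and keep partition 1 on the same filegroup. The row counts in sys.partitions are approximate, so treat this as a guard rather than a guarantee.

```sql
-- Assumption: dbo.FactInternetSalesPartitioned and dbo.FactInternetSalesArchive
-- already exist and satisfy the other SWITCH requirements.
IF EXISTS (SELECT 1
           FROM sys.partitions
           WHERE object_id = OBJECT_ID(N'dbo.FactInternetSalesArchive')
             AND index_id IN (0, 1)      -- heap or clustered index only
             AND partition_number = 1
             AND rows > 0)
    -- The target partition still holds rows, so the switch would fail.
    RAISERROR(N'Target partition 1 is not empty; SWITCH would fail.', 16, 1);
ELSE
    -- Metadata-only operation: no rows are physically copied.
    ALTER TABLE dbo.FactInternetSalesPartitioned
        SWITCH PARTITION 1 TO dbo.FactInternetSalesArchive PARTITION 1;
```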

Optimizing Storage

• What Are Sparse Columns?

• How to Compress Data and Backups

What Are Sparse Columns?

• An efficient way to manage object models that frequently contain numerous NULL values

• Contain column sets:

Group of sparse columns in a table

Not physically stored

Editable

CREATE TABLE Products (Product_Num int, Price decimal(7,2), ..., Color char(5) SPARSE NULL, Width float SPARSE NULL)

CREATE TABLE Products (Product_Num int, Price decimal(7,2), ..., Color char(5) SPARSE NULL, Width float SPARSE NULL, Details xml COLUMN_SET FOR ALL_SPARSE_COLUMNS)
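A complete, runnable variant of the second statement might look like the sketch below. The table name dbo.ProductsDemo and the sample rows are hypothetical; the elided columns from the slide are simply left out.

```sql
-- Hypothetical table with two sparse columns exposed through a column set.
CREATE TABLE dbo.ProductsDemo (
    Product_Num int          NOT NULL PRIMARY KEY,
    Price       decimal(7,2) NULL,
    Color       char(5)      SPARSE NULL,
    Width       float        SPARSE NULL,
    Details     xml          COLUMN_SET FOR ALL_SPARSE_COLUMNS
);
GO

-- Write a sparse column directly by name...
INSERT INTO dbo.ProductsDemo (Product_Num, Price, Color)
VALUES (1, 9.99, 'Red');

-- ...or write through the column set (this is what "editable" refers to).
-- A single INSERT cannot name both the column set and a sparse column.
INSERT INTO dbo.ProductsDemo (Product_Num, Price, Details)
VALUES (2, 19.99, N'<Width>2.5</Width>');

-- SELECT * returns the column set in place of the individual sparse columns.
SELECT * FROM dbo.ProductsDemo;

-- Sparse columns remain addressable when named explicitly.
SELECT Product_Num, Color, Width FROM dbo.ProductsDemo;
GO

DROP TABLE dbo.ProductsDemo;
```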

How to Compress Data and Backups

• Data compression can be enabled on tables, indexes, indexed views, and partitions

• The following data compression types can be defined:

Row compression

Page compression

• Backup compression:

Normally decreases time required to perform a backup

Managed with Transact-SQL, Backup Task, Maintenance Plan Wizard, or Integration Services Backup Database task
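In Transact-SQL, the options above might be exercised as in the following sketch. The database name AdventureWorksDW, the dbo schema, and the backup path are assumptions for illustration. In SQL Server 2008, data compression and the creation of compressed backups depend on the edition in use.

```sql
-- Estimate how much space page compression would save before enabling it.
EXEC sp_estimate_data_compression_savings
    @schema_name      = N'dbo',
    @object_name      = N'FactInternetSales',
    @index_id         = NULL,   -- all indexes
    @partition_number = NULL,   -- all partitions
    @data_compression = N'PAGE';
GO

-- Enable page compression on a whole table.
ALTER TABLE dbo.FactInternetSales
    REBUILD WITH (DATA_COMPRESSION = PAGE);
GO

-- Enable row compression on a single partition of a partitioned table.
ALTER TABLE dbo.FactInternetSalesPartitioned
    REBUILD PARTITION = 1 WITH (DATA_COMPRESSION = ROW);
GO

-- Take a compressed backup (hypothetical path).
BACKUP DATABASE AdventureWorksDW
    TO DISK = N'C:\Backups\AdventureWorksDW.bak'
    WITH COMPRESSION;
```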

SQL Server 2008 Reporting Services Enhancements

• Reporting Services Architecture and Management

• Authoring Reports

• Report Processing and Rendering

Reporting Services Architecture and Management

• SQL Server 2008 Reporting Services Architecture

• Management Tools for Reporting Services

SQL Server 2008 Reporting Services Architecture

[Diagram: Reporting Services service architecture]

Components shown: HTTP.sys, HTTP Listener, RPC, Authentication, WMI Provider

Report Manager: front-end access to report server items and operations; user interface extensions

Web Service: report processing; model processing; all (authentication, data, rendering, report processing)

Background Processing: report processing; model processing; data, rendering, report processing; scheduling; subscription and delivery; database maintenance

Service Platform: ASP.NET; Application Domain Management; Memory Management

Legend: External, Internal, Feature

Management Tools for Reporting Services

• Reporting Services Configuration Manager:

Manage configuration settings such as key backup, Web service URL, database connections, and more

• Report Manager:

Manage a native mode report server

Web-based

Manage a single instance over HTTP

Independent of Internet Information Services

• SQL Server Management Studio:

Used to manage security, scheduled jobs, and server properties

No longer used to manage folder hierarchy or Report Server content

Authoring Reports

• What Is Report Designer?

• How to Use Report Designer

• Chart Region Enhancements

• What Is the Tablix Data Region?

• Using Tablix Data Regions

What Is Report Designer?

• Hosted in Business Intelligence Development Studio

• Preview by using a local data cache

• Support for the following data sources:

Microsoft SQL Server/Analysis Services

OLE DB

Oracle

Open Database Connectivity (ODBC)

Report Server Model

SAP NetWeaver BI

Hyperion Essbase

Teradata

How to Use Report Designer

Configure the report

Design the report

Preview reports in the Report Designer

Manage data, including multiple data sets

Chart Region Enhancements

• Expanded user interface

• Support for formulas in chart data regions

• Context menus for each chart element

• Support for text editing directly in the chart

• Chart-type dialog box

• Ability to display multiple series on a single axis

What Is the Tablix Data Region?

• Combines features of both table and matrix formats

• Provides flexible grid layout

• Provides flexible groupings for rows and columns

• Supports hidden areas that become visible when another area is selected

• Provides the ability to create interactive reports

Using Tablix Data Regions

Drill down to subcategory

Subtotals

Nested groups

Multiple column groups

Report Processing and Rendering

• Report Processing Architecture Enhancements

• What Is On-Demand Report Processing?

• Report Rendering Enhancements

Report Processing Architecture Enhancements

• Fundamental changes to the report-processing architecture

• Provides on-demand report processing

• Existing reports are automatically upgraded to the new RDL schema when they are published on a server running SQL Server 2008 Reporting Services

What Is On-Demand Report Processing?

• Significantly reduces memory usage of the report server

• Supported through the Report Server Web service

Report Rendering Enhancements

• New object model that supports on-demand processing

• Report rendering to Windows Forms, Web Forms, CSV, XML, PDF, Image, and Office Excel:

CSV format supports data-only content

Can render subreports and nested data regions to Office Excel

• Provides more consistent paging across different rendering formats

SQL Server 2008 Analysis Services Enhancements

• Multidimensional Analysis with SQL Server 2008 Analysis Services

• Data Mining with SQL Server 2008 Analysis Services

Multidimensional Analysis with SQL Server 2008 Analysis Services

• Cube Wizard

• Dimension Wizard

• The Attribute Relationship Designer

• The Aggregation Designer

• AMO Warnings

Cube Wizard

• By using the new Cube Wizard, you can:

Use a more efficient interface

Create a cube based on a single denormalized table

Create a cube based on a data source that has only linked dimensions

Dimension Wizard

• By using the new Dimension Wizard, you can:

Create dimensions more efficiently

Automatically detect parent-child hierarchies

Provide safer default error configuration

Set member properties while creating the dimension

The Attribute Relationship Designer

Flexible relationship

Rigid relationship

Manage relationships

The Aggregation Designer

• In the new Aggregation Designer:

Aggregation designs are shown grouped by measure group

A new view is available for manual aggregation design

• The improved Usage-Based Optimization Wizard has:

The ability to append new aggregations to an existing aggregation design

The ability to modify storage settings for one or more partitions simultaneously

• Improved Aggregation Design Wizard

AMO Warnings

Available for logical errors in database design, when users depart from design best practices, and for nonoptimal aggregation designs

Nonintrusive warning messages appear when you pause the mouse over a warning

Data Mining with SQL Server 2008 Analysis Services

• Summary of Data-Mining Enhancements

• Separating Test and Training Data

• Filtering Model Cases

• Cross-Validation of Mining Models

• Data Mining Add-ins for the Microsoft Office System

Summary of Data-Mining Enhancements

• Improve both short-term and long-term predictions with the Microsoft Time Series algorithm, which has been enhanced to support both ARTXP and ARIMA algorithms

• Enable drill-through on a mining structure so that queries can retrieve the cases used for both training and testing

• Create column aliases to make it easier to understand column content. In the Data Mining Designer, the alias appears in parentheses next to the column usage label

• Separate test and training data

• Improve performance and analyze different scenarios by using Model Case Filtering

• Create cross-validation reports

• Use Data Mining Add-ins for the 2007 Microsoft Office system

Separating Test and Training Data

• There are three ways to partition data into training and test sets:

Using the Data Mining Wizard

Modifying the properties of the mining structure:

• You can modify the HoldoutMaxCases, HoldoutMaxPercent, and HoldoutSeed properties in the Properties window

Using a DMX statement, AMO, or XML DDL

Filtering Model Cases

• Limit the cases that are used in a model based on any attribute that is included in the model

• Use to compare subsets of your data, such as different regions

Cross-Validation of Mining Models

• Additional method to test and view mining-model accuracy

• Data is partitioned into cross-sections that are used to train and test models against each of the other cross-sections

• One portion of the data is used to test the model; the remaining data is used to train the model

• To define a cross-validation report, you must configure:

The Fold Count, which specifies the number of folds or partitions into which the data is broken for testing

The Max Cases, which specifies the maximum number of cases to use (a value of 0 specifies that all cases will be used)

The Target Attribute, which defines the column or attribute that you want to predict

The Target State, which defines a particular value that you want to analyze within the target attribute
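The role of the Fold Count can be illustrated with a plain Transact-SQL sketch that assigns cases to folds at random and then counts, for each fold, how many cases would be used for testing and how many for training. This is only an illustration of the partitioning idea; it is not how Analysis Services implements cross-validation, and the table variables and sample cases are hypothetical.

```sql
DECLARE @FoldCount int;
SET @FoldCount = 3;   -- plays the role of the Fold Count setting

-- Hypothetical case keys.
DECLARE @Cases TABLE (CaseID int PRIMARY KEY);
INSERT INTO @Cases (CaseID)
VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9);

-- Assign every case to one of @FoldCount folds of near-equal size, at random.
DECLARE @Folds TABLE (CaseID int PRIMARY KEY, Fold int NOT NULL);
INSERT INTO @Folds (CaseID, Fold)
SELECT CaseID,
       NTILE(@FoldCount) OVER (ORDER BY NEWID())
FROM @Cases;

-- For each fold: that fold is the test set, all other folds form the training set.
SELECT t.Fold AS TestFold,
       SUM(CASE WHEN c.Fold =  t.Fold THEN 1 ELSE 0 END) AS TestCases,
       SUM(CASE WHEN c.Fold <> t.Fold THEN 1 ELSE 0 END) AS TrainingCases
FROM (SELECT DISTINCT Fold FROM @Folds) AS t
CROSS JOIN @Folds AS c
GROUP BY t.Fold
ORDER BY t.Fold;
```

With 9 cases and 3 folds, each fold should show 3 test cases and 6 training cases, although which cases fall in which fold varies from run to run.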

Data Mining Add-ins for the Microsoft Office System

• Three separate components:

• Data Mining Client for Excel: create, test, and manage data-mining projects in Office Excel 2007

• Table Analysis Tools for Excel: use powerful analysis tools such as analyzing key influencers, highlighting exceptions, and forecasting for data that is stored in spreadsheets

• Data Mining Templates for Visio 2007: render decision trees, regression trees, cluster diagrams, and dependency nets in diagrams created in Visio 2007

QUESTIONS?
