1
Infrastructure Solutions for Microsoft SQL Server
Information Infrastructure Solutions
2
EMC Proven Solutions
What to expect
• Full-stack testing
  – Not just interop
• Shared deployment risk with EMC
• Decreased deployment/testing cycles
• Assured performance
• Verified building blocks for scalability
3
Proven Solutions approach
1. Capture & Define (requirements)
2. Test and Validate
3. Document
4. Publish
Locations: Singapore; Shanghai, China; Cork, Ireland; Hopkinton, MA; Santa Clara, CA; Vienna, Austria
4
Solutions Overview for Microsoft Applications
Proven Solutions, White Papers and Best Practices
• Tiered/Unified Storage: Symmetrix VMAX, VNX, VNXe, IOmega
• Replication, Backup and Recovery: Replication Manager, NetWorker, Avamar, Data Domain, EMC Disk Library (EDL)
• Business Continuity: Cluster Enabler, RecoverPoint, vCenter SRM, SRDF, MirrorView, Celerra Replicator
• Security: RSA Data Loss Prevention Suite, RSA SecureView, RSA enVision, RSA SecurID, RSA Adaptive Authentication
• Virtualization and Private Cloud: VMware vSphere, Microsoft Hyper-V, VBlock, VPLEX
5
SQL Server Always On - I/O Reliability Program
A Microsoft-validated program for storage solutions that comply with a set of core technical criteria to ensure the highest level of availability for mission-critical SQL Server applications
‒ All EMC storage arrays adhere to and can enforce write-ordering consistency
‒ Adherence to SQL Server Write-Ahead Logging (WAL) protocols with EMC's Consistency Technology (transactional integrity)
‒ Onboard protected caching to optimize I/O operations
‒ EMC storage platforms will not transition asynchronous I/O operations from a host into synchronous ones
• Microsoft SQL Server Database Engine Input/Output Requirements
• http://www.microsoft.com/sqlserver/2008/en/us/high-availability.aspx
• http://www.emc.com/solutions/samples/microsoft/ms-sql-server-always-on-technologies-program.htm
6
Enterprise Flash Drives and SQL Server
What to expect
• Decreased response time
• More throughput
• Smaller footprint, less power
  – Enable the use of NL-SAS with FAST
7
Enterprise Flash Drives with SQL Server
Decrease response time and improve scaling with assurance
• Response time can be as low as 1 ms (10x faster than 15k FC disks)
• A single Flash drive can deliver up to 30x the IOPS of an FC disk
• Smaller footprint and energy requirements reduced by ~38%
• Read-intensive workloads with low cache read-hit rate
• Random I/O patterns
• Small I/O requests (up to 16 KB)
• Extremely low latency, high transaction throughput
8
Enterprise Flash Drives with SQL Server
Decrease response time with assurance
• Selected Tables
  – Implementing table partitioning for read-intensive tables
  – Significant performance improvement
• TempDB
  – Typically generates large sequential I/O, but in some instances I/O can be very random
  – Moderate performance improvement
• Index
  – Moderate performance improvement
• Transaction Logs
  – Testing has shown no performance benefit over FC + array write cache
9
Reference Architecture: Tiered Storage
Tiered storage design with:
– CLARiiON CX4-960
– Two-node active/passive failover cluster
– Storage connectivity: 8 Gb/s FC
– Network connectivity: 1 GbE
Workload
– OLTP, TPC-E-like
– Number of customers: 75,000
– User data: 789 GB
– Expected throughput: 10,000 IOPS
10
Reference Architecture Layout – FC only
Reference Architecture Layout – FAST (Flash + FC)
11
Layout with Flash & FC Drives
Maintain performance with less acquisition cost
12
Layout with Flash & FC Drives
Maintain performance with smaller footprint
13
Layout with Flash & FC Drives
The table below compares our baseline configuration of 90 FC drives with a tiered configuration of 30 FC drives plus 4 Flash drives.

Configuration      All Fibre Channel   Tiered Flash/FC
Disks              90 FC               30 FC / 4 Flash
Tested TPS         Baseline            2.4% improvement
Tested IOPS        Baseline            4.2% improvement
Management         Baseline            80% less time
Acquisition cost   Baseline            38% less cost
Power/Cooling      Baseline            45% less cost
14
FAST VP for Virtualized SQL Servers
Reference Architecture
Storage
– Two-engine VMAX
– 1 GigE Ethernet network
– VMFS datastore
VMware
– Four SQL VMs (Hot/Warm)
– 8 vCPU, 16 GB RAM
[Diagram: clients, VMware cluster running Microsoft SQL Server 1-4, VMware vCenter Server, EMC Symmetrix Management Console, and an EMC Symmetrix VMAX with sub-LUN tiering of each LUN across Enterprise Flash and SATA drives]
15
Improving Performance and Efficiency

Performance and Capacity
                    IOPS     TPS     Formatted capacity (GB)
64 FC               10,559   1,504   14,400
4 Flash + 28 SATA   11,439   1,663   14,936
Change              +8%      +11%    +4%

CAPEX and OPEX
                    Disk drives   Annual power consumption*   Total cost over 10 years
64 FC               $145,280      $3,196                      $177,240
4 Flash + 28 SATA   $99,140       $1,296                      $112,100
Cost saving         $46,140       $1,900                      $65,140
% saving            32%           60%                         37%

Note: The tested two-tier approach is a viable option, not necessarily a best practice.
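The 10-year totals in the CAPEX and OPEX figures are consistent with drive cost plus ten times the annual power cost. The Python sketch below reproduces that arithmetic for other time horizons. It assumes a flat annual power cost and no other operating costs, which the slide does not state; the function and variable names are illustrative only.

```python
# Figures taken from the CAPEX and OPEX comparison on this slide.
CONFIGS = {
    "64 FC": {"drives_usd": 145_280, "annual_power_usd": 3_196},
    "4 Flash + 28 SATA": {"drives_usd": 99_140, "annual_power_usd": 1_296},
}


def total_cost(config, years=10):
    """Drive acquisition cost plus power cost over the given number of years.

    Assumes a flat annual power cost (an assumption, not from the slide).
    """
    return config["drives_usd"] + years * config["annual_power_usd"]


def saving_pct(baseline, tiered):
    """Saving of the tiered value relative to the baseline, in percent."""
    return 100.0 * (baseline - tiered) / baseline


fc = CONFIGS["64 FC"]
tiered = CONFIGS["4 Flash + 28 SATA"]

for years in (1, 5, 10):
    fc_cost = total_cost(fc, years)
    tiered_cost = total_cost(tiered, years)
    print(f"{years:>2} yr: FC ${fc_cost:,}  tiered ${tiered_cost:,}  "
          f"saving {saving_pct(fc_cost, tiered_cost):.1f}%")
```

At 10 years this gives $177,240 versus $112,100, matching the slide's totals.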
16
Performance
Use FAST Cache to substantially improve OLTP throughput
• Improve performance without complex data migration
  – No downtime
  – No application impact
  – No schema changes
• Improvement is dependent on several factors
  – Locality of data
  – Ratio of data to cache
[Chart: over 3x transaction throughput]
17
Efficiency
Use FAST Cache to increase utilization while maintaining performance
[Chart: equivalent performance; capacity utilization improved by 4x]
18
Ease of Use
FAST can eliminate complex data architecture projects
Traditionally, databases have hot and cold areas
• Common approach
  – Manually partition the database for TODAY's workload
  – Problem: complicated, costly, requires downtime, and only solves it for the present
• FAST approach
  – No manual steps required
  – Adapts dynamically to changing access patterns
  – Grow storage tiers as needed without any application-level impact
19
FAST Cache and SQL Server
• Improve storage efficiency
– Eliminate short-stroking
– Reduces power and cooling requirements
20
Improvement might not be immediate
• FAST requires time to monitor the system and move data around
  – Typically happens on a daily schedule
  – Does not adapt to mid-day changes in workload
• FAST Cache requires multiple accesses to promote data
  – Lab testing shows a few hours between the cache becoming available and being fully used
[Chart: TPS improvement within 5 hours]
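Why a cache that needs multiple accesses to promote data warms up gradually can be illustrated with a toy model. The Python sketch below is not EMC's algorithm; the class name, the chunk identifiers and the threshold of three accesses are all hypothetical and chosen only for illustration.

```python
from collections import Counter

PROMOTE_AFTER = 3  # hypothetical threshold, chosen only for illustration


class PromotionTracker:
    """Toy model: a chunk is promoted to cache after repeated accesses."""

    def __init__(self, promote_after=PROMOTE_AFTER):
        self.promote_after = promote_after
        self.access_counts = Counter()
        self.cached = set()

    def access(self, chunk_id):
        """Record one access; return True if it was served from cache."""
        if chunk_id in self.cached:
            return True
        self.access_counts[chunk_id] += 1
        if self.access_counts[chunk_id] >= self.promote_after:
            # Promoted now; only later accesses are served from cache.
            self.cached.add(chunk_id)
        return False


tracker = PromotionTracker()
results = [tracker.access("chunk-7") for _ in range(5)]
print(results)  # [False, False, False, True, True]
```

Data touched only once (for example by a one-time scan) never reaches the threshold in this model, so it does not displace hot data.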
21
VFCache
What to expect
• Massively decreased read response time
• Massively increased scale of performance
• Reduced workload on existing SAN fabric and arrays
• Licensing cost savings through increased consolidation
22
EMC VFCache Improves Server Performance
• Server Flash caching solution that uses intelligent caching software and hardware technology to reduce latency and increase throughput
• "Hottest" data is accessed through VFCache in the server, providing increased performance
• VFCache benefits for SQL Server 2012:
  – Lower database I/O latency by 60%
  – 50% more I/O serviced within 1 ms
  – 4x more transactions per SQL Server database
• Perfect fit for OLTP workloads
[Charts: Transactions Per Minute (TPM); Read Latency (sec)]
23
Total Protection
Write-through caching to the array
• Data persists down to EMC Symmetrix VMAX and EMC VNX networked storage to ensure high availability, end-to-end data integrity, data reliability, and disaster recovery
• Sharable and scalable
  – No stranded storage
24
Read Hit Example
1. Read request from application to an accelerated array LUN
2. VFCache Driver determines a hit occurred and accesses data from Flash device
3. Data returned from the Flash device is forwarded to the application
[Diagram: Application, VFCache Driver, PCIe Flash, SAN HBA, SAN storage; steps 1-3]
25
Read Miss Example
1. Read request from application to an accelerated array LUN
2. VFCache Driver determines a miss occurred and accesses data from array LUN
3. Data is read from the array and returned to the application
4. Read miss data is written to Flash device asynchronously
[Diagram: Application, VFCache Driver, PCIe Flash, SAN HBA, SAN storage; steps 1-4]
26
Write Example
1. Write request from application to an accelerated array LUN
2. VFCache Driver writes data to array LUN
3. Application write acknowledged upon array completion
4. Write data is asynchronously written to Flash device
[Diagram: Application, VFCache Driver, PCIe Flash, SAN HBA, SAN storage; steps 1-4]
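The read hit, read miss and write flows described on the last three slides can be sketched as a small write-through cache model in Python. This is an illustration only, not the VFCache driver: the class and attribute names are hypothetical, dictionaries stand in for the array LUN and the Flash device, and the Flash updates that the slides describe as asynchronous are done inline for simplicity.

```python
class WriteThroughCache:
    """Toy model of a server-side write-through read cache."""

    def __init__(self, array):
        self.array = array   # dict standing in for the SAN array LUN
        self.flash = {}      # dict standing in for the PCIe Flash device
        self.hits = 0
        self.misses = 0

    def read(self, block):
        if block in self.flash:
            # Read hit: data is served from Flash, the array is not touched.
            self.hits += 1
            return self.flash[block]
        # Read miss: data comes from the array and is returned to the caller.
        self.misses += 1
        data = self.array[block]
        # Flash is then populated (asynchronous in the flow on the slide).
        self.flash[block] = data
        return data

    def write(self, block, data):
        # The write always goes to the array first.
        self.array[block] = data
        # Flash is updated afterwards (asynchronous in the flow on the slide).
        self.flash[block] = data
        # The acknowledgement is returned only after the array write.
        return True


array = {1: "page-1"}
cache = WriteThroughCache(array)
first = cache.read(1)      # miss, served by the array
second = cache.read(1)     # hit, served by Flash
cache.write(2, "page-2")   # persisted on the array, then cached
third = cache.read(2)      # hit
print(cache.hits, cache.misses)
```

Because every write lands on the array before it is acknowledged, losing the Flash device in this model loses no data, which is the point of the write-through design.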
27
The VFCache Effect
More transactions, less waiting
Measured workload example: TPC-C-like workloads on Oracle and DB2 (1.2 TB database)
Response time: 50%   Throughput: 210%
28
VFCache for SQL Server - Architecture
29
VFCache for SQL Server - Impact
31
Virtualizing SQL
What to expect
• Reduced licensing cost
• Smaller footprint
• More power efficiency
• Assured performance
• Simplified scaling
• Increased operational flexibility
• Simplified disaster recovery
• Simplified test/dev provisioning
32
16 Physical SQL Servers
• Enterprise Edition
• 64 core licenses*
• Cost:
  – $439,936 (minimum)
* MS licenses a minimum of 4 cores per server. Frequently it's more.

16 Virtualized SQL Servers
• Enterprise Edition
• Two quad-core procs
  – 2 vCPU per core*
• Cost:
  – $54,992 - limited mobility (no SA)
  – $96,236 - unlimited mobility (with SA)
* MS recommends up to 8 vCPUs/core with Hyper-V

~5-8x reduction in SQL licensing costs
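The licensing figures can be cross-checked with a few lines of Python. The per-core price is not stated on the slide; it is derived here by dividing the physical cost by the 64 core licenses, and the variable names are illustrative.

```python
# Figures from the slide.
PHYSICAL_COST_USD = 439_936         # 16 physical servers
PHYSICAL_CORE_LICENSES = 64         # minimum of 4 cores licensed per server
VIRTUAL_HOST_CORES = 2 * 4          # two quad-core processors
VIRTUAL_COST_WITH_SA_USD = 96_236   # unlimited mobility (with SA)

# Derived value, not stated on the slide: implied price per core license.
price_per_core = PHYSICAL_COST_USD / PHYSICAL_CORE_LICENSES

# Licensing every physical core of the virtualization host (no SA).
virtual_cost_no_sa = price_per_core * VIRTUAL_HOST_CORES

reduction_no_sa = PHYSICAL_COST_USD / virtual_cost_no_sa
reduction_with_sa = PHYSICAL_COST_USD / VIRTUAL_COST_WITH_SA_USD

print(f"price per core:      ${price_per_core:,.0f}")
print(f"virtualized, no SA:  ${virtual_cost_no_sa:,.0f}")
print(f"reduction (no SA):   {reduction_no_sa:.1f}x")
print(f"reduction (with SA): {reduction_with_sa:.1f}x")
```

The derived no-SA cost matches the slide's $54,992, and the two ratios (about 4.6x with SA, 8x without) bracket the "~5-8x" claim.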
33
Key Benefits – Server Virtualization
• Consolidation - Achieve 2-10x consolidation ratio, especially for larger deployments
• Lower TCO - Significant power, cooling and data center space savings
• Availability - VM-based protection for SharePoint provides homogeneous high availability (VMware HA, WFC)
• Business Continuity - Simplified disaster recovery management (vCenter Site Recovery Manager, Cluster Enabler)
• Maintenance - Live migration of virtual machines (VMware vMotion, Hyper-V Live Migration)
• Load Balancing - Maximized overall performance with balanced HW utilization across the farm (VMware DRS, SCVMM PRO)
• Rapid Provisioning and Scaling - VM templates for fast provisioning and easier scale-out
34
Approaches to SQL Server Consolidation
Single instance - Database consolidation
– Requires common configuration
– Preferably similar workloads
– Resource contention (memory, TempDB, ...)
– Downtime/maintenance impact
– Limited performance management
Multiple instances - Instance consolidation
– Per-instance resource allocation
– Workload isolation
Multiple VMs - Hypervisor consolidation
– Better isolation
– Dynamic resource management
– Faster deployment
– VI benefits (CPU/memory over-commitment)
35
SQL Server Scaling in Virtual Deployments
Scale-Up Approach
• Multiple databases or SQL instances per VM
  – Fewer ESX Servers
  – Single point of failure
  – Larger VM SMP overheads
  – OS bottleneck, especially for 32-bit environments
Scale-Out Approach
• Single instance/database per VM
  – Better SQL instance and workload isolation
    • DSS vs. OLTP separation
  – More granular change management
  – DRS/VMotion more effective with smaller VMs
[Diagram: scale-up shows one ESX Server with two VMs, VM1 hosting SQL_1, SQL_3, SQL_5, SQL_7 and VM2 hosting SQL_2, SQL_4, SQL_6, SQL_8; scale-out shows two ESX Servers, each running four VMs with one SQL instance and its own OS per VM]
36
SQL Server Scale-Up Performance
Physical vs. Virtual
– At 1-2 vCPUs, ESX achieves 92% of native throughput
– 4 vCPUs can reach 88%, and 8 vCPUs 86%, of native throughput
– At 1, 2 and 4 vCPUs on the 8-pCPU server, ESX is able to effectively offload certain tasks to idle cores

Comparison                        Performance gain
Physical - 8 CPU vs. 4 CPU        1.71
vSphere 4.0 - 8 vCPU vs. 4 vCPU   1.67

[Chart: throughput normalized to the 1-CPU native result versus number of cores/vCPUs (1, 2, 4, 8), Native and ESX 4.0]
37
Virtualized SQL Server - Connectivity Options
‒ SQL 2008 on Windows 2008 performed similarly on virtual and physical machines
‒ The physical machine and the virtual machine (MSI) saturate at 9,000 users
‒ VMFS and RDM saturate at 8,000 users
‒ VMFS performance drops rapidly once user saturation is reached

VMware ESX - Performance/Connectivity options (iSCSI connectivity, avg. user response time < 2.0 sec)
Transactions per second
Users    Physical   Guest MSI   VMFS   RDM
8,000    415        414         403    411
9,000    461        460         330    425
10,000   460        456         353    375
38
Solutions Overview for Microsoft Applications
Proven Solutions, White Papers and Best Practices
• Tiered/Unified Storage: Symmetrix V-Max, Symmetrix DMX, CLARiiON CX4, CLARiiON AX4, Celerra Unified Storage, IOmega
• Replication, Backup and Recovery: Replication Manager, Avamar, NetWorker, Data Domain, EMC Disk Library (EDL)
• Business Continuity: RecoverPoint, SRDF, MirrorView, Cluster Enabler, vCenter SRM
• Security: RSA Data Loss Prevention Suite, RSA SecureView, RSA enVision, RSA SecurID, RSA Adaptive Authentication
• Virtualization and Private Cloud: VMware vSphere, Microsoft Hyper-V, VBlock, VPLEX
39
Local Replication with EMC
What to expect
• Rapid data restores and backups regardless of data size
• Offloaded backups to increase potential operating/maintenance time
• Single management point for all your apps and platforms
• Automated repurposing for test/dev
40
SQL Operational Recovery - Know Your RPO & RTO
• Daily backup: recovery point every 24 hours, with a recovery gap in between
• Snapshots/Clones: recovery point every 2-6 hours, with smaller recovery gaps
• Time-based CDP: time-indexed recovery points, but no SQL-aware recovery points
• CDP and/or CRR with SQL VDI bookmarks: application-optimized recovery points
  – Unlimited recovery points with SQL Server-aware VDI bookmarks
  – Bookmarks shown: Checkpoint, Patch, Post-Patch, Cache Flush, Eng. Version Release, Hot Backup, Checkpoint, VDI Snapshot
[Diagram: timelines (T) comparing recovery points and recovery gaps for each approach]
41
Common Interface for Multiple Recovery Scenarios
• On-demand mount allows a single interface for all types of replicas
  – RecoverPoint CDP bookmark, CRR bookmark, crash-consistent point-in-time copy
• Time slider for crash-consistent point-in-time mount
• User-friendly name for 'Any Point-in-Time' event
• File group Restore
  – Allows you to restore a subset of the database at file group granularity
• Full Restore
  – Restores the entire user database(s), including the data, log files and, for SQL Server 2005/2008, all full-text catalogs
• Replace Restore
  – Rapid VDI-based restore that skips all the checks (log backup, duplicate DB, duplicate filename)
• Advanced Recovery
  – Recovery (PiT)
  – No Recovery (for T-log replay)
  – Standby (read-only)
  – File System (manually attach)
42
Array and SAN Replication for SQL
What to expect
• Database and application layer replication integration
• Decreased bandwidth consumption with RecoverPoint
• Simplified failover
  – Push-button failover with VMware SRM
  – Automated failover with Cluster Enabler
  – Transparent failover with VPLEX
• Possibility of zero data loss RPO with synchronous replication
• Extremely rapid RTO with CE/VPLEX
43
SQL Server Availability
[Chart: recovery time (RTO: seconds, minutes, hours) versus recovery point (RPO: zero, seconds, minutes, hours) for Native Backup/Restore, Log Shipping, VDI/VSS Backup/Restore, SAN Replication (Async), SAN Replication (Sync), DB Mirroring (Async), DB Mirroring (Sync), Transactional Replication, Cluster Enabler (Async), Cluster Enabler (Sync) and Failover Cluster (Local)]
44
SQL Server Replication
Stretched Failover Clustering
– Automated restart solution, based on Windows Server Failover Clustering
– Provides high availability at the instance level
– Active/passive or multiple active instances/nodes
– Modules available for SRDF, MirrorView and RecoverPoint (sync and async)
– Can be fully automated
– Supports all cluster modes (MNS, FSW, etc.)
– Multiple-subnet support coming in "Denali"
[Diagram: Windows Failover Cluster stretched across Site A and Site B, with SRDF, MirrorView or RecoverPoint replication between the sites]
45
VMotion over Distance: Microsoft, Oracle, SAP
Enabled by VPLEX Metro
Site A
• VPLEX Metro
• Symmetrix VMAX
• CLARiiON CX4-480
Site B
• VPLEX Metro
• Vblock 1 (CX4)
Distance between sites: 100 km

Application / RDBMS      # VMs   VMotion (min)
SQL Server 2008          2       5:17
SharePoint Server 2007   7       3:37
SAP ERP 6.0 / BW 7.0     8       1:53
Oracle E-Business 12.1   2       3:52
Total                    19      5:17
46
Automated SQL Consistent Replicas at Both Sites With Physical or Virtual Hosts
1. Replication Manager freezes SQL databases (VDI)
2. Replication Manager server requests bookmark to be created
3. Replication Manager thaws SQL databases
4. Images are now mountable/recoverable
[Diagram: VMware ESX cluster whose virtual machines use RDM (D:, E:) and VMFS (F:, G:) volumes; production LUN on the SAN with a local CDP copy and journal; RecoverPoint replicating over the WAN to a remote CRR copy and journal; Replication Manager Server creating RecoverPoint bookmarks]
47
Solution Architecture
4 x SQL Server virtual machines
• 2 x Online Transaction Processing databases (75,000/25,000 users)
• 2 x Data Warehouse / Analytics databases (2 TB/1 TB)
Storage configuration
• OS: RAID 5 FC pool
• OLTP DB: RAID 1/0 FC pool
• DW DB: RAID 6 SATA pool
• DB logs: RAID 1/0 FC RG
• RP journals: RAID 1/0 FC RG
48
Optimizing WAN with RecoverPoint
[Chart: site write rate versus WAN traffic for Operating System + PageFile, SQL System DBs + SQL TempDB, and User Databases. Values shown: 43 mbits, 3.6 mbits, 23.3 mbits, 151 mbits, 177.9 mbits]
Average compression ratio achieved: 4:1
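One plausible reading of the chart values is that 3.6, 23.3 and 151 are the per-category site write rates, 177.9 is their total, and 43 is the resulting WAN traffic. That mapping is an assumption (the extracted text does not tie each number to a label), but under it the arithmetic is consistent with the stated 4:1 ratio, as this Python sketch shows.

```python
# Assumed mapping of the chart values (not confirmed by the source text).
site_write_rates_mbits = [3.6, 23.3, 151.0]
wan_traffic_mbits = 43.0

total_write_rate = sum(site_write_rates_mbits)
compression_ratio = total_write_rate / wan_traffic_mbits

print(f"total site write rate: {total_write_rate:.1f} mbits")
print(f"WAN traffic:           {wan_traffic_mbits:.1f} mbits")
print(f"reduction ratio:       {compression_ratio:.2f}:1")
```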
50
THANK YOU
51
SQL Server Best Practices
I/O Patterns
Generalizing SQL Server I/O patterns is difficult; sizing storage for an unknown workload is not trivial
• OLTP (Online Transaction Processing)
  – Typically heavy on random reads/writes (8 KB most common)
• RDW (Relational Data Warehousing)
  – Typically 64 KB+ sequential reads (table and range scans)
  – 128-256 KB sequential writes (bulk load)
• Operational activities
  – Backup/restore, index rebuild, etc.
In reality, "mixed" workloads are more common
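Because the I/O sizes differ so much between these patterns, the same IOPS figure implies very different bandwidth. A minimal Python sketch of the conversion follows; the function names are illustrative and the example workload numbers are made up for the demonstration.

```python
def throughput_mb_s(iops, io_size_kb):
    """Bandwidth in MB/s implied by an IOPS rate at a given I/O size."""
    return iops * io_size_kb / 1024


def iops_for_throughput(mb_s, io_size_kb):
    """IOPS needed to sustain a bandwidth at a given I/O size."""
    return mb_s * 1024 / io_size_kb


# An OLTP-style workload: many small random I/Os.
oltp_mb_s = throughput_mb_s(iops=10_000, io_size_kb=8)

# The same bandwidth delivered as 64 KB sequential reads needs far fewer IOPS.
dw_iops = iops_for_throughput(mb_s=oltp_mb_s, io_size_kb=64)

print(f"10,000 IOPS at 8 KB = {oltp_mb_s:.1f} MB/s")
print(f"{oltp_mb_s:.1f} MB/s at 64 KB = {dw_iops:.0f} IOPS")
```

This is why OLTP sizing is usually driven by IOPS and response time, while data warehouse sizing is driven by MB/s.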
52
SQL Server Best Practices
Planning
• SQLIO, SQLIOSIM, IOMETER, etc. are all synthetic, yet customizable, load tools
  – Ideally, resource planning should be based on an observed workload
• Consider splitting workloads with very different I/O characteristics at the physical level
  – Isolation at the physical level can provide predictable performance
Traditional storage best practices are challenged these days by the introduction of VP, FAST and FAST Cache
53
VNX FAST/FAST Cache
• Some workloads are too dynamic for FAST
  – Data access patterns can change between scheduled tiering runs
  – One-time operations can skew the real patterns
  – FAST Cache may be a better choice in those circumstances
• FAST Cache isn't needed everywhere
  – Ex: Transaction Logs, Reserve LUNs, Write Intent Log
  – Enabling FAST Cache on these items takes resources away from other things and hurts performance potential
54
SQL Server Best Practices
Allocating Storage
– Place SQL transaction log and database files on physically separate disk groups/pools
  • Avoid disk contention (random and sequential I/O)
  • Ensure disk or volume failures do not impact both log and data
– Place database files on RAID 1/0 or RAID 5 volumes/pools
  • RAID 5 for more read-intensive workloads (DW, or when writes are less than 30% of the workload)
  • RAID 1/0 for higher random-write workloads (OLTP)
  • RAID 6 usually for higher availability with large pools
– Plan for performance, not for capacity
  • Disk response time
  • IOPS (random I/O)
  • Throughput in MB/s (sequential I/O)
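The RAID guidance follows from the extra back-end I/O each host write generates. The Python sketch below uses the commonly cited rule-of-thumb write penalties (2 for RAID 1/0, 4 for RAID 5, 6 for RAID 6); those values are not taken from the slides, and array write caching can reduce the real penalty, so treat the result as a rough planning estimate.

```python
# Rule-of-thumb back-end I/Os per host write (assumption, not from the slides).
WRITE_PENALTY = {"RAID 1/0": 2, "RAID 5": 4, "RAID 6": 6}


def backend_iops(host_iops, write_fraction, raid_level):
    """Estimate the disk IOPS needed to serve a host workload."""
    reads = host_iops * (1 - write_fraction)
    writes = host_iops * write_fraction
    return round(reads + writes * WRITE_PENALTY[raid_level])


# Example workload: 10,000 host IOPS, 30% writes.
for level in WRITE_PENALTY:
    print(f"{level}: {backend_iops(10_000, 0.30, level):,} back-end IOPS")
```

As the write fraction grows, the gap between RAID 1/0 and RAID 5 widens, which is why the slide reserves RAID 5 for workloads with less than roughly 30% writes.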
55
SQL Server Best Practices
Allocating Storage - Log Files
• Log manager activity is sequential in nature
• Checkpoints are more random in nature
• Disk response time is key
  – Logical Disk counters: Avg. Disk sec/Write
  – SQL Server Databases: (Log Flush Wait Time)/(Log Flushes/sec)
  – Place transaction log files on a RAID 1/0 volume/pool for lower write latency and faster rebuilds
56
SQL Server Best Practices
Allocating Storage – Data Files
– To utilize more spindles, consider FILEGROUPs for database files
  • Balance the load across multiple LUNs/RAID groups/pools
  • As a general rule of thumb, use 0.25 to 1 file per core within a FILEGROUP
  • Use equal sizes for files within a single FILEGROUP
  • For performance benefit only; might be easier to configure and maintain using storage virtualization
‒ Consider placing the Tempdb database on separate spindles, depending on how well you know your workload's use of it
  • Same practices as data files with respect to sizing and growth
  • Pre-allocate the Tempdb space with a size large enough to accommodate the expected workload (1-10% of instance size)
  • Set the file growth increment large enough to minimize Tempdb expansions
  • Microsoft recommends setting the Tempdb files' FILEGROWTH increment to 10%
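The two rules of thumb above (0.25 to 1 data file per core, and Tempdb at 1-10% of instance size) translate into simple ranges. A minimal Python sketch, with illustrative function names and a made-up 8-core server as the example:

```python
import math


def data_file_count_range(cores):
    """Return (min, max) files per FILEGROUP using 0.25 to 1 file per core."""
    return max(1, math.ceil(0.25 * cores)), cores


def tempdb_size_range_gb(instance_size_gb):
    """Return (min, max) pre-allocated Tempdb size using 1-10% of instance size."""
    return 0.01 * instance_size_gb, 0.10 * instance_size_gb


low_files, high_files = data_file_count_range(cores=8)
low_gb, high_gb = tempdb_size_range_gb(instance_size_gb=789)

print(f"8 cores: {low_files} to {high_files} files per FILEGROUP")
print(f"789 GB instance: Tempdb between {low_gb:.1f} and {high_gb:.1f} GB")
```

Where in the range to land depends on how heavily the workload uses Tempdb, which is why the slide ties the placement decision to how well the workload is understood.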
57
SQL Server Best Practices
Settings
– Plan data file sizes accordingly (Virtual Provisioning!)
  • Don't rely on AutoGrowth
  • File growth can cause locking; set file sizes and autogrow increments appropriately
  • Disable Auto Shrink
– Plan for the following disk response times

Data files R/W operations (response time)   Recommendation
< 10 ms                                      Very good
< 20 ms                                      Acceptable
> 20 ms                                      Needs investigation and improvement

Log file write operations (response time)   Recommendation
< 5 ms                                       Very good
5 – 10 ms                                    Acceptable
15 – 20 ms                                   Investigate and improve
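The response-time bands can be applied to measured counter values with a small helper. The Python sketch below encodes only the bands given on the slide; the log-file bands leave 10-15 ms and values above 20 ms unstated, and the data-file bands do not cover exactly 20 ms, so the sketch reports those cases as "unspecified" rather than guessing. Function names are illustrative.

```python
def classify_data_latency(ms):
    """Classify data file read/write response time using the slide's bands."""
    if ms < 10:
        return "very good"
    if ms < 20:
        return "acceptable"
    if ms > 20:
        return "investigate and improve"
    return "unspecified"  # exactly 20 ms is not covered by the bands


def classify_log_latency(ms):
    """Classify log file write response time using the slide's bands."""
    if ms < 5:
        return "very good"
    if ms <= 10:
        return "acceptable"
    if 15 <= ms <= 20:
        return "investigate and improve"
    return "unspecified"  # 10-15 ms and above 20 ms are not covered


for sample in (3, 8, 12, 17):
    print(f"{sample} ms  data: {classify_data_latency(sample)}, "
          f"log: {classify_log_latency(sample)}")
```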
58
SQL Server Best Practices
Host
– Observe and adjust queue depth settings on the HBA if I/O accumulation at the host level is noticed
– Proper multipathing (zoning/mapping/masking) is important for both performance and availability
– Thin-provisioned LUNs
  • Use the "Quick Format" option!
  • Enable instant file initialization!
    – Speeds up database creates, restores and data file growth
    – Log files are still fully allocated and zeroed
Important I/O counters
• Avg. Disk sec/Read & Write
• Current Disk Queue Length*
• Disk Reads/Writes per second
• Disk Read & Write Bytes/sec
• Avg. Disk Bytes/Read & Write
* Hard to interpret due to virtualization of storage. Consider in combination with response times.
59
SQL Server Best Practices
Summary
Understanding the actual SQL Server workload is CRUCIAL for server hardware, storage and database optimization
Storage
‒ Understand the workload type (random, sequential or mixed)
‒ Monitor the average I/O size and its effect on the overall IOPS
‒ Always plan for peak loads
‒ Provide sufficient storage bandwidth to handle consolidated workloads
Virtualization
‒ Start with smaller Tier 2 databases and gradually move to larger databases
‒ Some SQL instances might not be the best candidates for virtualization
  • When more than 8 vCPUs are required (4 with Hyper-V)
  • When scale-out is not an option
Monitoring/Controlling
‒ SQL Profiler in correlation with Perfmon and STP, SQL Database Engine Tuning Advisor (index/partitions)
‒ SQL 2008 Resource Governor (CPU + memory control)
‒ SQL 2008 Performance Warehouse
‒ EMC Select partners – ZettaPoint (DBClassify), Precise (TPM)