© 2007 Microsoft Corporation. All rights reserved
Nicholas Dritsas, Principal Program Manager, Microsoft Corporation
Who is the SQL Customer Advisory Team (SQL CAT)?
Overview of large AS projects
Lessons Learned
People and Infrastructure
Performance
Improving Processing Performance
Improving Query Performance
Performant MDX Tips
Scale up vs. Scale Out
Works with largest, most complex SQL Server projects worldwide
US: NASDAQ, USDA, Verizon, Raymond James…
Europe: LSE, Barclays Capital
APAC: NUL, KT, Western Digital, JR East
Drives enterprise requirements back into SQL Server
Shares best practices with SQL Server community
SQL CAT Blog: http://blogs.msdn.com/sqlcat
SQL ISV PM Blog: http://blogs.msdn.com/mssqlisv/
You will see us at PASS and TechEd
Hilton Hotels: forecasting and budgeting application for their 2,400+ hotels. Real-time OLAP, scale-out AS and RS
Danske Supermarket: 2TB AS DB, 500GB largest cube size, 800 users, 10 concurrent queries
Retail Sales Analysis: sales reporting and analysis for multiple businesses. 30+ terabyte relational warehouse, scale-up AS, use of RS and ProClarity
Inventory Tracking and Analysis: purchase and inventory tracking and analysis. Teradata backend with ROLAP AS, RS, and ProClarity
Financial Institution Account Management: complex design with linked cubes and parent-child dimensions; 40M-member customer dimension
MDX Expert, MDX Expert, MDX Expert: MDX syntax, Profiler analysis (subcube usage, etc.)
Analysis Services: AS2000 expertise != AS2005 expertise
SQL Server
Business Domain Expert
The usual infrastructure guru with knowledge of:
64-bit
NUMA
SAN / DISKS
Performance Monitor Tools and analysis
SQL Server Profiler
Performance Monitor
ASCMD.exe (http://msdn2.microsoft.com/en-us/library/ms365187.aspx): command-line execution tool for batch processing
Custom Aggregation Tool: available as a sample with SP2 and on CodePlex
Browse aggregations
Create custom aggregations based on the query log or completely from scratch
Stress testing tools: based on Visual Studio for Testers (Ocracoke)
Web tests for Reporting Services
Query tests for MDX
ASLoadSim: just posted to CodePlex
64-bit is critical on large implementations because of the extra memory it can access
If the warehouse is not memory constrained, 64-bit might actually be slower
Older 64-bit chips run at much slower clock speeds than current 32-bit chips
Newer dual core chips run much faster than older 64-bit chips
This is not a SQL issue – it is true of any app that moves from 32-bit to 64-bit
To take full advantage of new AS2005 features, redesign rather than migrate from AS2000 to AS2005
AS2005 is a very different product and cubes may need redesign to take advantage of new features
Calculations are often handled differently and may need to be rewritten (especially to use the Scope statement)
Many customers are going from 32-bit AS2000 to 64-bit AS2005
Improve the queries that are used for extracting data from the source system
Use query binding to optimize partition queries
Use INT for keys whenever possible
Use SP2! Many processing improvements were included in SP2
Don't accept the UI default for parallel processing: go to the advanced processing tab and change it
Parallel tasks: 1.5 to 2 times the number of processors
Also set max number of data source connections
Utilize partitions to limit processing scope
Avoid too many aggregations
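The parallel-task rule of thumb above can be turned into a quick calculation. The following is a minimal Python sketch, not an Analysis Services API: the function name `parallel_task_range` is hypothetical, and rounding the lower bound up to a whole task is an assumption. Treat the result as a starting range to test against your own hardware.

```python
import math


def parallel_task_range(processors: int) -> tuple[int, int]:
    """Return a (low, high) starting range for the number of parallel tasks.

    Applies the 1.5x to 2x rule of thumb; the lower bound is rounded up
    because a fractional task count is not meaningful (an assumption).
    """
    if processors < 1:
        raise ValueError("processors must be at least 1")
    low = math.ceil(1.5 * processors)
    high = 2 * processors
    return low, high


if __name__ == "__main__":
    for cpus in (4, 8, 16):
        low, high = parallel_task_range(cpus)
        print(f"{cpus} processors: try {low} to {high} parallel tasks")
```

The maximum number of data source connections usually needs to be raised in step with the chosen value, otherwise the extra tasks may simply wait for a connection.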
Use incremental updates whenever possible
For best performance use ASCMD.EXE and XMLA
Use <Parallel> </Parallel> to group processing tasks together until Server is using maximum resources
Proper use of <Transaction> </Transaction>
Process fact data (ProcessData) and indexes (ProcessIndexes) separately instead of using ProcessFull
The two steps have different CPU usage patterns
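As an illustration of grouping processing tasks in a parallel block, here is a hedged Python sketch that generates an XMLA batch for a list of partitions. The object IDs (`SalesDB`, `Sales`, `Fact Sales`, `P2006`, `P2007`) and the helper name `build_process_batch` are hypothetical; the element layout follows the Batch/Parallel/Process shape that Management Studio scripts out, and it assumes the data step uses the `ProcessData` type and the index step uses `ProcessIndexes`.

```python
import xml.etree.ElementTree as ET

# Namespace used by Analysis Services XMLA commands.
XMLA_NS = "http://schemas.microsoft.com/analysisservices/2003/engine"
ALLOWED_TYPES = {"ProcessData", "ProcessIndexes", "ProcessFull"}


def build_process_batch(database_id, cube_id, measure_group_id,
                        partition_ids, process_type):
    """Build an XMLA batch that processes the given partitions in parallel."""
    if process_type not in ALLOWED_TYPES:
        raise ValueError(f"unsupported process type: {process_type}")
    batch = ET.Element("Batch", {"xmlns": XMLA_NS})
    parallel = ET.SubElement(batch, "Parallel")
    for partition_id in partition_ids:
        process = ET.SubElement(parallel, "Process")
        obj = ET.SubElement(process, "Object")
        ET.SubElement(obj, "DatabaseID").text = database_id
        ET.SubElement(obj, "CubeID").text = cube_id
        ET.SubElement(obj, "MeasureGroupID").text = measure_group_id
        ET.SubElement(obj, "PartitionID").text = partition_id
        ET.SubElement(process, "Type").text = process_type
    return ET.tostring(batch, encoding="unicode")


if __name__ == "__main__":
    partitions = ["P2006", "P2007"]  # hypothetical partition IDs
    # Step 1: load fact data only; step 2: build indexes and aggregations.
    for step in ("ProcessData", "ProcessIndexes"):
        print(build_process_batch("SalesDB", "Sales", "Fact Sales",
                                  partitions, step))
```

Each generated batch can be saved to a file and submitted with ASCMD.exe, running the data step to completion before the index step.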
Watch for long running queries that may block processing
Lock waits counter for AS
Ensure sufficient memory to avoid temp file
Temp file rows (or bytes) written/sec
Avoid requesting too many operations in parallel
Quota blocked counter for AS
Memory grants pending for SQL engine
Define natural hierarchies: missing natural hierarchies are the most common cause of query performance problems
Create cascading attribute relationships and define natural hierarchies
Remove redundant relationships whenever possible
Partition cube data: reduce the cube space touched by a query
Increase query parallelism
Partitioning strategy must match query pattern
Define custom aggregations when needed: capture a workload using the Query Log
Use the Aggregation Manager sample and have it read the query log
Avoid too many aggregations
Write performant MDX: read Nicholas Dritsas's blog at http://blogs.msdn.com/sqlcat
Read Mosha Pasumansky's blog at http://www.sqljunkies.com/WebLog/mosha
Make sure you have an MDX expert on the project
Keep cube space as small as possible: only include measure groups that are needed
Use multiple measure groups when measures are not queried together
Calculated members: optimize according to the MDX tips that follow
Some can be done using SCOPE in MDX
Use NON_EMPTY_BEHAVIOR whenever possible
Warm the cache
Monitor for competing processes, such as processing events and long running queries
Monitor for insufficient memory – cached data being pushed out of memory
Cleaner Memory Nonshrinkable KB counter in AS
Use SQL Server Profiler to determine if bottleneck is in storage engine or query execution engine
Add the duration values (Duration column) of the Query Subcube events generated by the MDX query to determine total storage engine time
Subtract that from the total execution time (the duration of the Query End event) to get total query execution engine time
Reduce cube space before calculations: watch the Total cells calculated counter in AS
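The storage engine versus query execution engine split described above is simple arithmetic over a Profiler trace. Below is a minimal Python sketch that assumes the trace has been exported to rows with `EventClass` and `Duration` fields, that the per-subquery events are the Profiler `Query Subcube` events, and that the total comes from the `Query End` event. The function name `split_engine_time` and the sample durations are hypothetical.

```python
def split_engine_time(events):
    """Return (storage_engine_ms, query_engine_ms) for one MDX query.

    events: iterable of dicts with 'EventClass' and 'Duration' (milliseconds).
    """
    # Storage engine time: sum of all subcube query durations.
    storage_ms = sum(e["Duration"] for e in events
                     if e["EventClass"] == "Query Subcube")
    # Total execution time: duration reported when the query ends.
    total_ms = sum(e["Duration"] for e in events
                   if e["EventClass"] == "Query End")
    # Whatever is left is attributed to the query execution engine.
    return storage_ms, total_ms - storage_ms


if __name__ == "__main__":
    trace = [  # hypothetical trace rows for a single query
        {"EventClass": "Query Subcube", "Duration": 120},
        {"EventClass": "Query Subcube", "Duration": 80},
        {"EventClass": "Query End", "Duration": 500},
    ]
    storage, formula = split_engine_time(trace)
    print(f"storage engine: {storage} ms, query execution engine: {formula} ms")
```

A large storage engine share points toward partitioning and aggregation design; a large query execution engine share points toward the MDX itself.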
How many partitions should you have? In general, more small partitions are better than a few large partitions, until you get over 2,000 (dependent on your hardware)
More CPUs and more I/O threads as you increase the number of partitions
Partitioning Tips (dependent on hardware)
<= 20M records per partition: have seen acceptable performance up to 55M records
Partitions of 250MB or smaller where possible: have seen acceptable performance up to 2GB
Fewer than 2000 partitions
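The partitioning guidelines above can be combined into a rough sizing check. This is a minimal Python sketch under stated assumptions: the thresholds are the slide's guideline values (20M rows, 250MB, fewer than 2,000 partitions), the function name `estimate_partitions` is hypothetical, and the inputs are your own estimates of fact row count and processed size.

```python
import math

MAX_ROWS_PER_PARTITION = 20_000_000   # guideline from the tips above
MAX_MB_PER_PARTITION = 250            # guideline from the tips above
MAX_PARTITIONS = 2000                 # stay below this count


def estimate_partitions(total_rows, total_mb):
    """Return (partition_count, within_limit) for a measure group.

    The count is the larger of the row-based and size-based estimates,
    so both per-partition guidelines are satisfied at once.
    """
    by_rows = math.ceil(total_rows / MAX_ROWS_PER_PARTITION)
    by_size = math.ceil(total_mb / MAX_MB_PER_PARTITION)
    count = max(by_rows, by_size, 1)
    return count, count < MAX_PARTITIONS


if __name__ == "__main__":
    # Hypothetical measure group: 1 billion rows, about 100GB processed.
    count, ok = estimate_partitions(1_000_000_000, 100 * 1024)
    print(f"partitions needed: {count}, under the partition limit: {ok}")
```

If the estimate exceeds the limit, the slide's upper observations (55M records, 2GB) suggest there is room to use larger partitions, at the cost of testing on the actual hardware.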
Large parent-child dimensions: redesign as a denormalized table to improve performance
Linked measure groups on remote servers: linked measure groups on the same server perform well
Linked dimensions: additional querying overhead
Referenced dimensions: materialize to improve query performance
Real-time ROLAP: use a separate partition to improve query performance
Avoid assigning values like 0, Null, "N/A", "-" to cells that would otherwise remain empty
Avoid redundant Sum/Aggregate calculations in situations where default/normal cell value aggregation would do
Avoid run-time checks that result in a slow execution path
Case and IIF are frequently used in this manner; use SCOPE to reduce the calculation space instead, or write the expression so that one branch is "null"
Prefer static literal hierarchy and member references
Use Measures.Sales instead of Dimensions(0).Sales
Avoid StrToSet, StrToMember, StrToValue: perform string manipulation in AS stored procedures
Use the Non_Empty_Behavior optimization hint: use it instead of writing calculation expressions of the form Aggregate(NonEmptyCrossjoin(Descendants(…, Leaves) …).
Use an explicit measure name: use it instead of Measures.CurrentMember whenever possible.
With calculations like "expr1 * expr2", ensure the expression sweeping the largest area/volume in the cube space is on the left side
Replace simple "Measure1 + Measure2" calculations with computed columns in the DSV or in the SQL data source
Instead of writing expressions like Sum(Customer.City.Members, Customer.Population.MemberValue), consider defining a separate measure group on the City table, with a Sum measure on the Population column
Use the Exists function: use Exists wherever possible instead of Filter on member properties.
Use the Non Empty keyword and the IsEmpty function to reduce cube space
Use minus over Filter for a single member: when removing a single member from a set, use the minus (set difference) operator instead of the Filter function
Avoid:
Filter({set}, .CurrentMember <> "UNKN")
Use:
({set} - {&[UNKN] member})
Avoid Linked Member: use multiple measure groups in a single cube
Scale up to 64-bit, multiple processors, SAN, …
Scale up tips
Add more memory (4GB per core recommended)
More CPUs can help parallelize processing and complex queries
Partition your cubes effectively (see Partitioning Tips slide)
If the project reaches the physical limits of the hardware, scale out becomes the next logical solution
Can create multiple instances of AS on the same server
Can be difficult to manage
Requires the use of Windows System Resource Manager (WSRM)
Frequently better to scale out to multiple smaller servers
Create a separate processing server and multiple query servers
Use multiple query servers to load balance queries
Use dedicated processing server
Use synchronize function in AS or copy directories using Robocopy (or similar) for best performance with large databases
References
BLOGS:
http://blogs.msdn.com/sqlcat
http://www.sqljunkies.com/WebLog/mosha
PROJECT REAL-Business Intelligence in Practice
Analysis Services Performance Guide
TechNet: Analysis Services for IT Professionals
© 2007 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it
should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.
MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.