Power BI Premium Planning
and Deployment
Whitepaper
Summary: This document provides guidance and best practices for planning and deploying
Premium capacity for well-defined workloads.
Writers: Sirui Sun, Kasper de Jonge
Technical Reviewers: Orion Lee, Siva Harinath, Robert Bruckner, Josh Caplan, Cristian Petculescu,
Sergei Gundorov
Published: November 2017
Contents
Introduction
  Prerequisites
  Understanding the types of load
    Loading the model into memory
    Querying the data for reporting
  High-Level Testing Approach
Step 1: Performing Load Testing on Premium
  Acquiring dedicated capacity for testing
  Load Testing the Premium Capacity
  Load Testing Best Practices
    Multiple users, multiple datasets
    Test user management and licensing implications
    Factoring in think time
    DirectQuery and Live connection query limits
    Starting small to isolate components with high performance impact
    Extrapolating findings between different capacity node sizes
Step 2: Optimizing reports and datasets for peak performance
  Use filters to limit report visuals to display only what’s needed
  Optimize your model
  Understanding dashboards and query caches
  Deep-dive into query performance with SQL Profiler and Power BI Desktop
  Deep-dive into memory utilization with Process Explorer and Power BI Desktop
Step 3: Deploying and monitoring Premium in your organization
  Understanding the Premium capacity usage metrics
  Understanding usage in your organization
  Scaling up beyond a P3 node
  Understanding v-core pooling and scale-up
Further Reading
Introduction
This document provides guidance and best practices for planning and deploying Power BI
Premium capacity. The content is meant to be applicable to all types of Power BI Premium
deployments – from bespoke deployments serving small, high-importance audiences to
extremely large, scaled-out deployments serving hundreds of thousands of users.
Prerequisites
This whitepaper is a 200-level document. It assumes general knowledge of what Premium is, and
a firm grasp of the basic Premium management concepts. Additional information on this can
also be found in the Premium whitepaper and the Premium documentation.
Understanding the types of load
To build a strong foundation of knowledge, it is important to first understand the most
important types of load we can have on a Premium node.
Loading the model into memory
In order for any interaction to happen on an imported dataset, it must be loaded into Premium
capacity’s in-memory engine. Let’s drill into the details of what happens when a model is loaded
into memory:
Moving the data from data source to Power BI
The first step is moving the data from the data source to Power BI. This doesn’t take any
significant memory or CPU, just bandwidth, which can potentially be constrained by the data
source.
Execute transformations as defined in the Query Editor
All data that gets imported into Power BI through the query editor uses a special data mashup
engine (also known as the Power Query engine). During data refresh, this engine will execute
any transformations defined during development. This mashup engine utilizes your Premium
capacity to provide incredible flexibility and power – the more complex the operation, the more
time and system resources required during refresh.
Tip: to conserve Premium resources, consider delegating work to the underlying data source. For
example, when using SQL Server as the data source, many operations can be translated into a SQL query,
letting SQL Server perform the heavy lifting. Conversely, when you import data from a JSON file,
all operations need to happen in the Power Query engine, thereby taxing your capacity’s CPU and memory.
Loading the data into memory
Now that the data is in the right shape, it gets moved into the Power BI in-memory engine. The
new data is loaded into memory alongside the existing data that it will replace. This ensures an
uninterrupted experience for users who are interacting with the existing data. At that moment in
time you might need up to 2 ½ times the amount of memory for that single model. As soon as
the refresh completes, the old copy is dropped and the extra memory is released.
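As an illustrative back-of-the-envelope calculation, the sketch below applies the "up to 2 ½ times" rule of thumb described above. The function names and model sizes are hypothetical; treat this as a planning aid, not an exact sizing formula.

```python
# Rough sizing sketch: during refresh, a model may need up to ~2.5x its
# in-memory size (existing copy + incoming copy + transient overhead).
# The factor and the example sizes below are illustrative assumptions.
REFRESH_PEAK_FACTOR = 2.5

def peak_refresh_memory_gb(model_size_gb):
    """Worst-case memory one model may need while it refreshes."""
    return model_size_gb * REFRESH_PEAK_FACTOR

def concurrent_refresh_peak_gb(model_sizes_gb):
    """Worst case when several models refresh at the same time."""
    return sum(peak_refresh_memory_gb(s) for s in model_sizes_gb)

# Example: 4 GB, 6 GB and 2 GB models refreshing concurrently
print(concurrent_refresh_peak_gb([4, 6, 2]))  # 30.0 (GB at peak)
```

If the result approaches your node’s total memory, consider staggering refresh schedules before reaching for a larger SKU.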
Ready for Analysis
Once these steps have happened, the data is ready to be analyzed.
Note: During data refresh of imported datasets, Power BI updates your data models with the latest data
from the underlying data sources. Data refresh can be triggered either manually by the user, or on a set
schedule. During data refresh, all the above steps are taken on the model. As you can imagine, refresh places
significant load on your capacity, especially when multiple models are being refreshed at once.
Note: Not all data in Power BI is imported. When using live connect (connecting to SSAS) or DirectQuery,
the data is retrieved directly from the data source at analysis time, so in those cases much less
memory and CPU is used on the capacity. Of course, the data source takes the load instead, and you will
have to move your testing there.
Querying the data for reporting
Once the model is loaded into memory, querying the data for reporting will also tax your
capacity. Power BI has two types of content, reports and dashboards, each
with a different impact on performance.
Dashboards
Dashboards are designed to be the single pane of glass, where you can see all the metrics that
matter to you at a glance. To facilitate high performance, the data displayed by dashboards is
cached by Power BI and updated on a schedule. This query cache is shared between all users
of the dataset, unless Row-Level Security is used, in which case a separate cache is created per user,
resulting in much more load on the system. Users connecting to the dashboard to look at the data
incur minimal load on the system, since the data is retrieved from the cache.
Updating the query cache places some load on the system and its data sources at each
update interval. For import datasets, this cache is updated every time a scheduled refresh takes
place. For DirectQuery or Live connection datasets, the cache update schedule can be customized,
ranging from once every 15 minutes to once a week.
Note: the one exception to the above is live report tiles pinned to dashboards. Live report tiles do not fetch
data from the cache, but instead behave like reports, as outlined below.
Reports
Reports generate queries to the dataset when opened and on interaction with the visuals
(clicking a filter, slicer, etc.). These queries then are sent to the underlying data source. When
using an imported dataset, this results in queries placing CPU and memory load on the capacity.
The magnitude of the load depends on the number of visuals/filters on the report, the amount of
data that it needs to query, and the types of calculations it needs to perform. A large dataset
with complex calculations causes more load than simple aggregations do. The reports themselves
perform some client-side caching as well – for example, if the user triggers the same query twice
per session, the second time it will be served from the client-side cache. It is also important to
recognize that the client (the browser) also performs some actions here in rendering the visuals
using JavaScript.
Note: reports built on live connect datasets also have their default visual states cached during scheduled
cache refresh. This is again to improve load performance. Note that cross-highlighting, filtering, slicing, etc.
will still generate queries against the underlying data source.
Of course, as a Premium customer, you’re expecting many users for your reports and
dashboards, which makes the load calculation much more interesting. In general, the expectation
is that most users will look at the data at a glance on dashboards (not causing much load),
and every now and then a user will dive into the actual report, causing more load. In practice, as
more users use a report, each consumes an incremental amount of memory and CPU. Your
mileage may vary based on the calculations used: simple aggregations cause much less
load than complex iterative DAX functions.
High-Level Testing Approach
Now that we know what causes load on the system, let’s see how to test it. The
following outlines our high-level approach; this paper will dive into each step in detail.
1) Perform automated load testing on your content to realistically simulate desired load
and observe the performance of Premium under those circumstances.
2) Optimize your reports and datasets based on the findings from (1).
3) Deploy Premium based on those numbers, then monitor Premium performance after
deployment and adjust load accordingly.
FAQ: Can’t I just plug my numbers into a formula which tells me how much I need? When it comes to
determining the performance of reports in Premium capacity, there are numerous factors, each of which can
have a huge impact on report performance – e.g.: model size, model complexity, report visual complexity
and quantity, viewer behaviors, cache hit rates, etc. As a result, it is exceedingly difficult to determine the
amount of capacity needed with high levels of accuracy without load testing.
Step 1: Performing Load Testing on Premium
The first step will be setting up a dedicated capacity node, moving your desired reports and
datasets onto the node, and then performing automated load testing on said node to determine
the performance characteristics under load.
Acquiring dedicated capacity for testing
The most cost-effective and flexible method for obtaining Premium capacity is to purchase the
month-to-month P* SKUs in the Office 365 portal. This allows you to try the EM* or P* SKU
without an annual commitment. Note that when you purchase a P* SKU, you get a
corresponding number of v-cores that you can then spend to set up any type of capacity – e.g. if
you purchase two P1s, then you’ll have 16 v-cores, which you can use to set up a P2 instead. You may
move between these capacity configurations at no cost, provided you have the requisite number of
v-cores. This allows you to compare the different types of capacities during your monthly billing
period at no additional cost.
You may also wish to contact your Microsoft account team for assistance throughout this
process. They may have additional tools and options available.
Load Testing the Premium Capacity
Once the content you’d like to test is hosted in Premium capacity, there are many ways to place
simulated load on the capacity:
• In the simplest form, you can recruit many users to simultaneously use the service,
making requests against a report
• You could also use an HTTP traffic tool such as Fiddler to record and play back traffic to the
Power BI service
• Visual Studio Enterprise provides a load testing suite
• You could use a more sophisticated automation tool such as Selenium to script a
load test involving numerous users making various requests to the Power BI service
Warning: automated load testing must only be performed on Premium capacity. Automated
load testing should never be performed on shared capacity. The Power BI team already performs
rigorous load planning and testing in shared capacity to ensure the best experience for all users.
Load Testing Best Practices
Multiple users, multiple datasets
Use multiple AAD users as part of load-testing, instead of just rapidly making requests with a
single user. There are safeguards in the service which prevent individual user accounts from
making an unrealistic number of requests over a short period of time. Such limits are nearly
impossible to reach for individual users using the service but may be reached if load testing only
uses one AAD user.
Using multiple AAD users is doubly important when you have row-level
security set up on your dataset, which causes different users to see different content.
Using just a single user in this case may drastically misrepresent performance – e.g. queries
for a single user will get many more cache hits than queries from many different users.
If you are planning on hosting multiple active dashboards, reports and datasets at once in your
Premium capacity, then the above is equally true for datasets – i.e. the performance
characteristics of your capacity can vary greatly given simultaneous usage of different datasets.
Test user management and licensing implications
To follow the above best practice, organizations may need to create many dummy test users. To
automate this process, Office 365 exposes APIs for programmatically creating, editing and
deleting users. Read more here.
Factoring in think time
In setting up your automated load tests, you will want to ensure that you are putting the Premium
capacity under realistic loads. One common pitfall for load testers is not giving simulated users
enough “think time” between actions – e.g. having them click through the reports too
rapidly. In practice, we find that users spend an average of 5-10 seconds thinking after
actions such as slicing or cross-highlighting.
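To make this concrete, here is a minimal Python sketch of how a load-test driver might insert think time between simulated actions. The names are hypothetical, the 5-10 second range comes from the observation above, and real tools (Visual Studio load tests, Selenium scripts) have their own mechanisms for this.

```python
import random
import time

# Average think time observed in practice: 5-10 seconds (assumption
# taken from the guidance above, not a service-enforced value).
THINK_TIME_RANGE = (5.0, 10.0)

def think_time(rng=random):
    """Pick a realistic pause between simulated user actions."""
    return rng.uniform(*THINK_TIME_RANGE)

def simulate_user(actions, pause=time.sleep, rng=random):
    """Run each report action with realistic think time in between.
    `actions` is a list of callables that each issue one request
    (slice, cross-highlight, page change) via your load-test tool."""
    for action in actions:
        action()
        pause(think_time(rng))
```

Injecting `pause` and `rng` keeps the driver testable; in production runs the defaults (`time.sleep` plus the global random generator) apply.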
DirectQuery and Live connection query limits
All dedicated capacities have query throughput limits when making DirectQuery or Live
connection queries to external data sources – e.g. 30 queries per second for a P1. These limits
are enforced via a leaky bucket algorithm – e.g. if 40 queries are received in a single second, 30 will
execute and the remaining 10 will wait a second before executing.
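The leaky-bucket behavior can be illustrated with a toy scheduler. This is a simplified model for intuition only, assuming a fixed per-second limit (30/s, as for a P1) and whole-second windows; the service’s actual implementation may differ.

```python
def schedule_queries(arrivals, limit_per_second=30):
    """Toy leaky-bucket scheduler: at most `limit_per_second` queries
    execute in each one-second window; the overflow waits for the next.
    `arrivals` maps second -> number of queries arriving in that second.
    Returns second -> number of queries executed in that second."""
    executed = {}
    backlog = 0
    last = max(arrivals) if arrivals else -1
    second = 0
    while second <= last or backlog > 0:
        backlog += arrivals.get(second, 0)
        run = min(backlog, limit_per_second)
        executed[second] = run
        backlog -= run
        second += 1
    return executed

# 40 queries arrive in second 0 against a P1-style limit of 30/s:
print(schedule_queries({0: 40}))  # {0: 30, 1: 10}
```

A sustained arrival rate above the limit makes the backlog grow without bound, which is why end users see steadily increasing report load times under throttling.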
Note that currently, the throttles for these DirectQuery and Live connection query limits are
still being tuned. If you’re load testing with DirectQuery/live connection datasets, you may see
higher query rates than what is included with your SKU. You should not count on this increased
query volume for production workloads. Instead, monitor your data source to see
how many queries it receives per second, and use that to gauge the capacity SKU that is
required.
Starting small to isolate components with high performance impact
Not all visuals are created equal when it comes to performance impact – far from it. Often,
certain visuals will be orders of magnitude slower and more expensive than others, due to the
queries that they send to the Premium capacity. Such visuals can have a domino effect on
performance: by hogging CPU time and memory, they slow down successive report loads, which
makes the expensive visuals on those later loads render even more slowly, and so on.
To isolate such problematic visuals, start with small, simple reports to establish a baseline of best
case performance. From there, incrementally add visuals and observe the impact in performance.
This will allow you to isolate the performance impact of individual visuals and pick out
candidates for optimization. To get further information on visual complexity and CPU cost, see
the section below on using the SQL Profiler.
Extrapolating findings between different capacity node sizes
It may be tempting to take load test findings on a particular node and extrapolate it linearly to
larger node sizes. However, given the complexities of the system at play, there are a number of
factors which may cause performance to not scale linearly between nodes – e.g. cache hit rates,
resource contention, etc. We’ve seen cases where doubling hardware results in better than
double performance. We’ve also seen the exact opposite. The only way to know for sure is to
test.
Step 2: Optimizing reports and datasets for peak performance
There are a number of steps you can take to optimize your content for better performance – be
it in Premium or shared capacity. The following section walks through some best practices.
Use filters to limit report visuals to display only what’s needed
A major customer preparing for a Premium launch noticed that they were encountering memory
issues. The culprit: this organization’s primary report had a table which featured 100M+ rows of
data. End users were to use slicers on the page to get to the rows they wanted, which typically
numbered only in the dozens.
However, the table by default shows up unfiltered – i.e. all 100M+ rows. The data for these rows
needs to be loaded into memory, uncompressed, at every refresh. This created huge memory
loads. The solution was to reduce the maximum number of items that the table displayed using the
“Top N” filter. The maximum was still set larger than what users would ever need – e.g. 10,000 – so
the end user experience was unchanged, but the memory utilization of the report dropped by
multiple orders of magnitude.
A similar approach to the above is strongly suggested for all visuals on your reports. Ask
yourself: is all the data in this visual needed? Are there ways to filter down the amount of data
shown in the visual with minimal impact to the end user experience? Note that tables in
particular can be very expensive.
Optimize your model
The same mindset above should be applied to the data model. Some best practices:
• Tables or columns that are unused should be removed if possible
• Avoid distinct counts on fields with high cardinality – i.e. millions of distinct values.
• Take steps to avoid fields with unnecessary precision and high cardinality. For example,
you could split highly unique datetime values into separate columns – e.g. month, year,
date, etc. Or, where possible, use rounding on high-precision fields to decrease
cardinality – (e.g. 13.29889 -> 13.3).
• Use integers instead of strings, where possible
• Be wary of DAX functions which need to test every row in a table – e.g. RANKX – in the
worst case, these functions can dramatically increase run-time and memory
requirements given even linear increases in table size.
• When connecting to data sources via DirectQuery, consider indexing columns that are
commonly filtered or sliced on – this will greatly improve report responsiveness
For more guidance on optimizing data sources for DirectQuery, please consult the DirectQuery
whitepaper.
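To see why the splitting and rounding advice above matters, the following sketch (with fabricated sample data) counts distinct values before and after the transformations; lower cardinality compresses better in a columnar in-memory engine.

```python
# Illustrative only: splitting a high-precision datetime column and
# rounding a high-precision numeric column both shrink cardinality.
from datetime import datetime, timedelta

base = datetime(2017, 11, 1)
timestamps = [base + timedelta(seconds=i * 37) for i in range(10_000)]
readings = [13.29889 + i * 0.000173 for i in range(10_000)]

# Nearly every raw timestamp is unique...
print(len(set(timestamps)))  # 10000 (all distinct)
# ...but split into (date, hour), the cardinality collapses:
print(len({(t.date(), t.hour) for t in timestamps}))
# Rounding 13.29889-style readings to one decimal has the same effect:
print(len({round(r, 1) for r in readings}))
```

The same distinct-count comparison can be run against your own columns (e.g. via DAX Studio or Power Query) to find the fields that dominate model size.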
Understanding dashboards and query caches
Visuals pinned to dashboards are served by the query cache when the dashboard is loaded.
By contrast, when visiting a report, the queries are made on-the-fly to the back-end cores of the
Premium capacity.
Note: when you pin live report tiles to a dashboard, they are not served from the query cache – instead, they
behave like reports, and make queries to back-end cores on the fly.
As the name suggests, retrieving data from the query cache is much less taxing on the
Premium capacity than making queries to the back-end cores, and its performance is very
dependable. As a result, dashboard views tend to be much less taxing on your capacity than
report views, while at the same time providing consistent performance.
One way to take advantage of this functionality is to have dashboards be the first landing page
for your users. Pin often used and highly requested visuals to the dashboards. In this way,
dashboards become a valuable “first line of defense” which provide consistent performance with
less load on the capacity. Users can still click through to the report to dig into the details.
Deep-dive into query performance with SQL Profiler and Power BI Desktop
For a deeper dive into which visuals are taking up the most time and resources, you can connect
SQL Profiler to Power BI Desktop to get a full view of query performance. Instructions as
follows:
1. Install SQL Server Profiler and run Power BI Desktop
SQL Server Profiler is available as part of SQL Server Management Studio.
2. Determine the port being used by Power BI Desktop
Run the command prompt or PowerShell with administrator privileges, and use netstat
to find the port that Power BI Desktop is using for analysis:
> netstat -b -n
The output should be a list of applications and their open ports – e.g.
TCP [::1]:55786 [::1]:55830 ESTABLISHED
[msmdsrv.exe]
Look for the port used by msmdsrv.exe and note it down for later use. In this case, you would
use port 55786.
3. Connect SQL Server Profiler to Power BI Desktop
• Start SQL Server Profiler from the Start Menu
• File > New Trace
• Server Type: Analysis Services
• Server name: localhost:[port number found above]
• At the next screen, select “Run”
• Now, the SQL Profiler is live, and actively profiling the queries that Power BI Desktop
is sending
• As queries are executed, you can see their respective durations and CPU times –
using this information, you can determine which queries are the bottlenecks
Through the SQL Profiler, you can identify the queries which are taking up the most CPU time,
which in turn are likely the performance bottlenecks. The visuals which execute those queries
should then be a focal point of continued optimization.
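Incidentally, the port-hunting in step 2 can be scripted. The sketch below parses `netstat -b -n` output of the shape shown earlier; the helper name is hypothetical, and the parsing assumes the owning process is listed on the line after its TCP entry, as in the sample.

```python
# Sample output in the shape produced by `netstat -b -n` (fabricated).
SAMPLE = """\
TCP    [::1]:55786    [::1]:55830    ESTABLISHED
 [msmdsrv.exe]
TCP    127.0.0.1:49152    127.0.0.1:49153    ESTABLISHED
 [other.exe]
"""

def find_msmdsrv_port(netstat_output):
    """Return the local port of the first TCP entry owned by
    msmdsrv.exe, the Analysis Services engine hosted by
    Power BI Desktop. Returns None if no entry is found."""
    lines = netstat_output.splitlines()
    for i, line in enumerate(lines):
        if "msmdsrv.exe" in line:
            # Walk back to the nearest TCP line, take its local port
            for prev in reversed(lines[:i]):
                if prev.strip().startswith("TCP"):
                    local = prev.split()[1]  # e.g. [::1]:55786
                    return int(local.rsplit(":", 1)[1])
    return None

print(find_msmdsrv_port(SAMPLE))  # 55786
```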
Deep-dive into memory utilization with Process Explorer and Power BI Desktop
For a deeper dive into the memory utilization of your reports and datasets, you can use the
Process Explorer in conjunction with Power BI Desktop. This approach will allow you to closely
approximate the memory demands of your content once it is running in Premium capacity.
Step 1: download SysInternals Process Explorer and run Power BI Desktop
Step 2: get Power BI Desktop performance statistics
• Open Process Explorer
• Find and double-click on PBIDesktop.exe
• In the Properties dialog box that appears, select the “Performance Graph” tab
This page describes the memory and CPU demands of Power BI Desktop in real time, and
provides a great tool for understanding the memory usage of your report:
• In turn, you can load up a report, and see the effect of loading the report in memory.
This corresponds to a user viewing the report in Power BI Premium, and loading that
report into memory
• You can trigger a refresh in Power BI Desktop (notice the large spike in memory
utilization!). This corresponds to data refresh in the service
• Notice that cross-highlighting, filtering and slicing also cause CPU and memory
utilization.
Through Process Explorer, you can build an understanding of the memory demands of your
models and use that to extrapolate the amount of memory needed in your Premium capacity.
Step 3: Deploying and monitoring Premium in your organization
This section describes best practices for deploying your Premium capacity, monitoring its
success after launch, and taking steps to scale up or down as needed.
Understanding the Premium capacity usage metrics
The Premium capacity admin portal provides three gauges which indicate the load placed on
your capacity in the past week. The gauges all work on hourly time windows: they
indicate how many hours in the past week the corresponding metric was above 80% utilization,
which would point to a degraded end user experience.
CPU: represents CPU utilization. Actions that tax the CPU include viewing and clicking through
reports and refreshing models – especially those that are imported into the service. During periods of
high CPU usage, users may experience poor performance on reports – e.g. reports taking longer
to load or to update after a cross-highlight or filter.
Memory thrashing: represents the memory pressure on the Premium capacity. The Premium
capacity opportunistically keeps models in memory to improve load performance. Therefore,
measuring memory utilization (e.g. 24 out of 25 GB used) would be misleading. Instead, we use
memory thrashing, which represents a measure of how often models in memory are evicted to
make room for new models. Actions that tax memory include: scheduled refresh jobs (typically
the single heaviest contributor to memory pressure) and users viewing many different models in
quick succession. During periods of high memory pressure, end users may experience long wait
times for reports to first display data, and scheduled refresh jobs may take much longer than usual
as they wait in a queue for the required memory allocation.
DirectQuery: represents the usage against DirectQuery/live connection throttles – e.g. 30
queries per second for P1. This is of course taxed by viewers accessing reports which are built on
DirectQuery or live connection datasets. During periods of high utilization, the throttle will limit
the number of DirectQuery/live connection queries that can be executed over any given second.
End users may experience longer load times for reports.
Understanding usage in your organization
Audit Logs
To further understand usage of Power BI and Power BI Premium in your organization, it is
strongly recommended that you enable Power BI audit logs. Once audit logging is enabled, logs
will be created and stored for significant actions in the Power BI service, including creation,
editing and viewing of dashboards and reports.
This video shows how you can bring the Power BI audit logs into Power BI in order to build a
report which provides a 360-degree view of Power BI usage in your organization. Through these
means, you can also achieve a detailed understanding of usage patterns for content in your
Premium capacity – e.g. which Premium content is viewed the most, and at what times and by
whom. This can be a powerful tool for further understanding your Premium deployment. For
example, you can determine what hours place peak load on your capacity and reach out to users
who were accessing content during that time to understand if they encountered performance
issues.
Note that the audit logs can also serve as an effective tool for chargeback scenarios where
multiple departments share a single Premium capacity and would like to contribute to the
payment proportional to their usage.
Usage Metrics for Content Creators
The usage metrics feature presents a subset of the data that the audit logs provide – i.e. which users
are accessing which dashboards/reports on which days. It can be further customized to
understand usage for an entire workspace. While usage metrics provide less information than audit logs,
they do not require prior setup, and can be accessed simply by selecting the “usage metrics”
option on any dashboard/report.
Power BI REST APIs
The Power BI REST APIs can be used to automate information-gathering tasks for the purposes of
understanding Power BI usage in your organization. This is especially helpful when the
information is not yet available in audit logs or usage metrics. For example, you can use the APIs
to iterate through the datasets in a Premium workspace and then enumerate their
refresh histories, in order to understand all the refreshes happening in a capacity over a period
of time.
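A hedged sketch of that enumeration, using the documented Power BI REST endpoints for listing the datasets in a workspace and reading a dataset’s refresh history. Acquiring the Azure AD access token is out of scope and assumed to be handled elsewhere; error handling and paging are omitted for brevity.

```python
# Assumed endpoints (Power BI REST API, v1.0):
#   GET /groups/{groupId}/datasets
#   GET /groups/{groupId}/datasets/{datasetId}/refreshes
import json
import urllib.request

API = "https://api.powerbi.com/v1.0/myorg"

def datasets_url(group_id):
    return f"{API}/groups/{group_id}/datasets"

def refreshes_url(group_id, dataset_id):
    return f"{API}/groups/{group_id}/datasets/{dataset_id}/refreshes"

def _get(url, token):
    req = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def refresh_histories(group_id, token):
    """Map each dataset id in the workspace to its refresh history."""
    datasets = _get(datasets_url(group_id), token)["value"]
    return {d["id"]: _get(refreshes_url(group_id, d["id"]), token)["value"]
            for d in datasets}
```

Running this on a schedule and storing the results gives a longitudinal view of refresh activity across the capacity.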
Scaling up beyond a P3 node
Workloads with extremely large expected audiences may require a set of content be hosted in
hardware beyond what is offered in the largest Premium node – currently the P3 node with 32 v-
cores and 100 GB of memory. In those cases, the content in question should be duplicated
across multiple workspaces. In turn, those workspaces should be assigned to different Premium
capacity nodes. Duplication of content can be automated via the Power BI REST APIs.
Understanding v-core pooling and scale-up
Through v-core pooling and one-click scale-up, Power BI Premium offers significant deployment
flexibility. Understanding how v-core pooling works is essential to effective administration of
Power BI Premium.
When you purchase any SKU of Power BI Premium, your tenant gains access to the
corresponding number of v-cores. For example, if you purchase a P3 node, you will receive
access to 32 v-cores.
From there, you have the flexibility to provision your v-cores into one or more nodes of
Premium capacity. In the previous example, with 32 v-cores, the customer could provision four
P1s (4 nodes * 8 cores/node), two P2s (2 nodes * 16 cores/node), or one P3. They are also free to
move between node configurations, so long as they have the v-cores available.
Once set up, a Premium capacity can be scaled up on demand, so long as the requisite number
of v-cores is available. For example, say you have purchased 8 v-cores and set up a P1 (also 8 v-cores).
After deploying, you realize more capacity is needed, so you purchase 8 more v-cores for a total
of 16. Now you can scale your P1 up to a P2.
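The pooling arithmetic above can be sanity-checked with a few lines of code. The SKU sizes are the P-SKU v-core counts used in this paper; the function name is illustrative.

```python
# v-core pooling: purchased v-cores can be carved into any mix of nodes
# whose total does not exceed the pool. SKU sizes per this paper.
P_SKU_VCORES = {"P1": 8, "P2": 16, "P3": 32}

def can_provision(purchased_vcores, nodes):
    """True if the listed nodes fit within the purchased v-core pool."""
    return sum(P_SKU_VCORES[n] for n in nodes) <= purchased_vcores

print(can_provision(32, ["P1"] * 4))    # True: 4 x 8 = 32
print(can_provision(32, ["P2", "P2"]))  # True: 2 x 16 = 32
print(can_provision(8, ["P2"]))         # False: a P2 needs 16 v-cores
```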
Further Reading
• Power BI Premium whitepaper – an introduction to Power BI Premium
• Power BI Premium documentation – all-up documentation on Power BI functionality
• Power BI Enterprise Deployment Whitepaper – all-around guidance on large-scale
Power BI deployments
• Presentation: building fast and reliable reports in Power BI
• Presentation: Power BI Enterprise Deployments