QlikView Dimensionality

Qlik Design Blog: 102 posts authored by Henric Cronström

A chart in QlikView or in Qlik Sense has Dimensions and Measures. What these are is described in Dimensions and Measures. This post is about charts with multiple dimensions and/or multiple measures and your options when designing such charts.

In a simple chart with one dimension and one measure, the number of data points is determined by the number of possible values in the dimension. For example, a bar chart with Month as dimension typically has twelve bars, one per month.

If you want to add complexity to your chart, you can choose between adding a dimension and adding a measure. Whichever you do, the chart will increase its rank, or dimensionality, and change appearance.

Below you have two bar charts: The left chart has two dimensions and one measure, while the right chart has one dimension and three measures. Yet, they are almost identical.

The left chart has Sum(Amount) as measure, while the right has Sum({$} Amount) as first measure, and similar expressions for the additional two measures.

The reason why they look identical is that they have the same dimensionality: An array of measures can be regarded as a virtual dimension, and if so, both charts have two dimensions, i.e. a dimensionality of two.
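The equivalence can be illustrated outside Qlik with a small Python sketch. The sample rows, the field names Month and Product, and the idea that each of the three measures restricts the sum to a single product are assumptions made for illustration; the original expressions are not fully preserved in this transcript.

```python
from collections import defaultdict

# Hypothetical sample rows: (Month, Product, Amount)
rows = [
    ("Jan", "A", 10), ("Jan", "B", 20), ("Jan", "C", 30),
    ("Feb", "A", 15), ("Feb", "B", 25), ("Feb", "C", 35),
    ("Jan", "A", 5),
]


def two_dimensions(rows):
    """Chart with dimensions Month and Product and one measure, Sum(Amount)."""
    grid = defaultdict(int)
    for month, product, amount in rows:
        grid[(month, product)] += amount
    return dict(grid)


def one_dimension_three_measures(rows, products=("A", "B", "C")):
    """Chart with dimension Month and one measure per product.

    Each measure sums Amount for a single product only, so the list of
    measures acts as a second, virtual dimension.
    """
    grid = {}
    for product in products:  # one measure per product
        for month, row_product, amount in rows:
            if row_product == product:
                key = (month, product)
                grid[key] = grid.get(key, 0) + amount
    return grid


# Both charts end up with the same set of data points.
same_points = two_dimensions(rows) == one_dimension_three_measures(rows)
```

Both functions produce one value per (Month, Product) pair, which is why the two charts would look the same.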

This property is not unique to bar charts. Most charts can be altered this way, e.g. pie charts:

Notice that the pie chart to the right has zero dimensions. It is a dimensionless chart with several measures. Several chart types can display relevant information without having a dimension: e.g. the Pie chart, the Bar chart, the Funnel chart, the Radar chart, the Pivot table and the Straight table. Try it, and you'll see.

There are some charts that don't fit the above description though. First of all, the Gauge is a dimensionless chart that always has zero as dimensionality.

Secondly, the Trellis chart is just a container for multiples of another chart type. By using a Trellis, you can effectively add one or two dimensions. For example, you can add a dimension to a Gauge using a Trellis chart:

Chart Dimensionality
Posted by Henric Cronström, Jan 27, 2015

Qlik Design Blog ... | Qlik Community, 12-Feb-15, http://community.qlik.com/blogs/qlikviewdesignblog/authors/hic

Further, the Scatter chart is different from other charts in that it always needs one dimension to define the number of data points, and two measures to define the coordinates. The dimension cannot be replaced by an array of measures.

With the above knowledge, it is easier to describe the limits of different chart types:

The first number is the largest dimensionality for which the chart makes sense. However, some charts can be made to display a higher dimensionality (number to the right), but it is rarely easy to understand such a chart, so I don't recommend it.

Finally, the conclusion from the above is that you have a choice of displaying the last dimension either as a dimension or as an array of measures. If you choose a dimension, then you have the advantage that the user can select in this dimension by clicking in the chart. But if you instead choose an array of measures, you have greater flexibility for customizing the measures. You can for instance add a measure which is different from the first ones; e.g. in addition to Sales 2014 and Sales 2015 you can display the relative change.
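As a rough illustration of that last point, the array of measures could be sketched in Python as follows. The function name and the sample figures are hypothetical, and relative change is assumed here to mean the difference divided by the earlier year's value.

```python
def sales_measures(sales_2014, sales_2015):
    """Return three measures: two yearly sums plus a derived third measure.

    The third measure is of a different kind than the first two, which is
    something a plain Year dimension could not express.
    """
    return {
        "Sales 2014": sales_2014,
        "Sales 2015": sales_2015,
        # Assumed definition: change relative to the earlier year.
        "Relative change": (sales_2015 - sales_2014) / sales_2014,
    }


# Hypothetical figures for one bar group in the chart.
example = sales_measures(200.0, 250.0)
```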

With this, I hope that you have some new ideas for visualizations.

HIC

Tags: dimension, chart, dimensionality

In the QlikCommunity forum I have often seen people claim that you should minimize the number of hops in your Qlik data model in order to get the best performance.

I claim that this recommendation is not (always) correct.

In most cases, you do not need to minimize the number of hops since it affects performance only marginally. This post will try to explain when an additional table will significantly affect performance and when it will not.

The problem is which data model to choose:

A Myth about the Number of Hops
Posted by Henric Cronström, Jan 20, 2015

The question is: Should you normalize and have many tables, with several hops between the dimension table and the fact table? Or should you join the tables to remove hops?

So, I ran a test where I measured the calculation time of a pivot table calculating a simple sum in a large fact table and using a low-cardinality dimension, while varying the number of hops between the two. The graph below shows the result. I ran two series of tests, one where the cardinality of the dimensional tables changed by a factor of 10 for each table, and one where it changed by a factor of 2.

You can clearly see that the performance is not affected at all by the number of hops, at least not between 0 and 3 hops.

By 4 hops, however, the calculation time in the 10x series starts to increase slightly, and by 5 hops it has increased a lot. But this is not due to the number of hops. Instead, it is the result of the primary dimension table (the dim table closest to the fact table) getting large: By 5 hops it has 100,000 records and can no longer be regarded as a small table.

To show this, I made a second test: I measured the calculation time of the same pivot table using a fixed 3-table data model, varying the number of records in the intermediate table, but keeping the sizes of the other tables constant.

In real life, this structure would correspond to a part of a more complex data model, e.g.

Facts - Products - Product Groups

Order Lines - Order Headers - Customers

The result of my measurement can be seen in the red bars below:

The graph confirms that the size of the intermediate table is a sensitive point: If it has 10,000 records or fewer, its existence hardly affects performance. But if it is larger, you get a performance hit.

I also measured the calculation times after joining the intermediate table, first to the left with the fact table, and then to the right with the dimension table, to see if the calculation times decreased (blue and green bars). You can see that joining tables with 10,000 records or fewer does not change the performance. But if you have larger tables, a join with the fact table may be a good idea.

Conclusions:

The number of hops does not always cause significant performance problems in the chart calculation. But a large intermediate table will.

If you have both a primary and a secondary dimension (e.g. Products and Product Groups), you should probably not join them. Leave the data model as a snowflake.

If you have the facts in two large tables (e.g. Order Lines and Order Headers), you should probably join them into one common transaction table.
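The last recommendation, joining two large fact tables into one transaction table, can be sketched in Python. This is only an illustration of the join itself, not of how Qlik performs it; the field names OrderID, CustomerID, OrderDate, Product and Amount and the sample records are hypothetical.

```python
# Hypothetical Order Headers table, keyed by OrderID.
order_headers = {
    1: {"CustomerID": "C1", "OrderDate": "2015-01-05"},
    2: {"CustomerID": "C2", "OrderDate": "2015-01-06"},
}

# Hypothetical Order Lines table: several lines per order.
order_lines = [
    {"OrderID": 1, "Product": "P1", "Amount": 100},
    {"OrderID": 1, "Product": "P2", "Amount": 50},
    {"OrderID": 2, "Product": "P1", "Amount": 75},
]


def join_headers_into_lines(order_lines, order_headers):
    """Left-join the header fields onto every order line.

    The result keeps one row per order line, so sums over Amount are
    unchanged, while the hop between lines and headers disappears.
    """
    return [
        {**line, **order_headers.get(line["OrderID"], {})}
        for line in order_lines
    ]


transactions = join_headers_into_lines(order_lines, order_headers)
```

After the join, a dimension table such as Customers links directly to the combined transaction table through CustomerID.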

HIC

PS. A couple of disclaimers:

1. The above study only concerns the chart calculation time - which usually is the main part of the response time.

2. If the expression inside your aggregation function contains fields from different tables, none of the above is true.

3. Your data is different from mine. You may get slightly different results.

Tags: star_schema, data_modeling, snowflake_schema, number_of_hops, primary_dimension

One Qlik function that occasionally causes confusion is the Date function. I have often seen errors caused by incorrect usage of it, so today I will try to explain what the function does and what it does not.

Interpretation vs Formatting

The first thing you should be aware of is the difference between Date#() and Date(). The first is an Interpretation function and the second is a Formatting function.

Interpretation functions use the textual value of the input, and convert it to a number.

Formatting functions use the numeric value of the input, and convert it to text.

In both cases, the output is a dual, i.e. it has both a textual value and a numeric value. The textual value is displayed, whereas the numeric value is used for all numerical calculations and sorting.

The table below shows how to use the interpretation function Date#(). Note that the format code must match the input parameter.

This is very different from the formatting function Date(). The next table shows how to use this function. Note that the format code matches the format of the output text.
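The distinction can be mimicked in Python as a rough analogue. The names Dual, date_interpret and date_format are hypothetical, the format codes are Python strftime codes rather than Qlik format codes, and the numeric value is assumed to be the number of days since December 30, 1899.

```python
from dataclasses import dataclass
from datetime import date, datetime, timedelta

# Assumed day zero of the numeric date value.
EPOCH = date(1899, 12, 30)


@dataclass(frozen=True)
class Dual:
    """A value with both a textual and a numeric representation."""
    text: str
    num: int


def date_interpret(text, fmt):
    """Analogue of Date#(): parse the text using fmt and attach a number.

    The format code must describe the input text. The text is kept as is.
    """
    parsed = datetime.strptime(text, fmt).date()
    return Dual(text=text, num=(parsed - EPOCH).days)


def date_format(value, fmt):
    """Analogue of Date(): take the number and produce text in fmt.

    The format code describes the output text. The number is kept as is.
    """
    num = value.num if isinstance(value, Dual) else value
    as_date = EPOCH + timedelta(days=num)
    return Dual(text=as_date.strftime(fmt), num=num)


# Interpret a text first, then format the resulting number differently.
parsed = date_interpret("2014-12-02", "%Y-%m-%d")
shown = date_format(parsed, "%d/%m/%Y")
```

Here parsed and shown carry the same number but different text, which mirrors how a dual sorts and calculates on the number while displaying the text.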

The Date Function
Posted by Henric Cronström, Dec 2, 2014
