2 OLAP Operations

Post on 11-May-2015

OLAP Operations

DATA WAREHOUSING Multi Dimensional OLAP

Using the DW

Example of a two-dimensional query:

‘What is the total revenue generated by property sales in each city, in each quarter of 2004?’

The choice of representation depends on the types of queries the end user may ask.

Compare representations: a three-field relational table versus a two-dimensional matrix.
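The comparison can be sketched in a few lines of Python. The rows and revenue figures below are invented for illustration and are not taken from the slides; the point is only that the same data can be held as (city, quarter, revenue) rows or as a city-by-quarter matrix.

```python
# Three-field relational table: one row per (city, quarter, revenue) fact.
# All figures are hypothetical, made up for this illustration.
rows = [
    ("Glasgow", "Q1", 100),
    ("Glasgow", "Q2", 120),
    ("London", "Q1", 200),
    ("London", "Q2", 210),
    ("Glasgow", "Q1", 50),  # a second sale in the same city and quarter
]

# The distinct members of each dimension become the matrix axes.
cities = sorted({city for city, _, _ in rows})
quarters = sorted({quarter for _, quarter, _ in rows})

# Two-dimensional matrix: one row per city, one column per quarter.
matrix = [[0] * len(quarters) for _ in cities]
for city, quarter, revenue in rows:
    matrix[cities.index(city)][quarters.index(quarter)] += revenue

# Print the matrix with its axis labels.
print("city", *quarters)
for city, line in zip(cities, matrix):
    print(city, *line)
```

In the table form, answering the query means scanning and grouping rows; in the matrix form, each answer is a single cell lookup.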

Example of a three-dimensional query:

‘What is the total revenue generated by property sales for each type of property (Flat or House) in each city, in each quarter of 2004?’

Data Cube

Compare representations: a four-field relational table versus a three-dimensional cube.

A data cube is a subset of highly interrelated data, organized so that users can combine any attributes in the cube (e.g., stores, products, customers, suppliers) with any metrics in the cube (e.g., sales, profit, units, age) to create various two-dimensional views, or slices, that can be displayed on a computer screen.

A cube represents data as cells in an array.

A relational table represents multi-dimensional data in only two dimensions.
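As a rough illustration of "cells in an array", the sketch below loads a four-field table into a three-dimensional nested list. The data and the helper name `cell` are hypothetical, chosen only to mirror the property-sales example.

```python
# Four-field relational table: (property type, city, quarter, revenue).
# All figures are hypothetical.
rows = [
    ("Flat", "Glasgow", "Q1", 100),
    ("House", "Glasgow", "Q1", 150),
    ("Flat", "London", "Q1", 200),
    ("House", "London", "Q2", 250),
]

# One axis per dimension.
types = sorted({r[0] for r in rows})
cities = sorted({r[1] for r in rows})
quarters = sorted({r[2] for r in rows})

# Three-dimensional cube: cube[type][city][quarter] holds one cell.
cube = [[[0 for _ in quarters] for _ in cities] for _ in types]
for ptype, city, quarter, revenue in rows:
    cube[types.index(ptype)][cities.index(city)][quarters.index(quarter)] += revenue


def cell(ptype, city, quarter):
    """Look up one cell by its dimension members (hypothetical helper)."""
    return cube[types.index(ptype)][cities.index(city)][quarters.index(quarter)]


print(cell("House", "London", "Q2"))
print(cell("Flat", "London", "Q2"))  # no sales recorded, so the cell stays at 0
```

Every combination of members has a cell, including combinations with no data, which is why sparse cubes can waste space in a naive array layout.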

Use multi-dimensional structures to store data and relationships between data.

Multi-dimensional structures are best visualized as cubes of data, and cubes within cubes of data. Each side of a cube is a dimension.

A cube can be expanded to include other dimensions.

A cube supports matrix arithmetic.

Multi-dimensional query response time depends on how many cells have to be added ‘on the fly’.

As the number of dimensions increases, the number of cells in the cube increases exponentially.
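The growth is easy to see numerically: the cell count is the product of the number of members in each dimension. The sketch below assumes, purely for illustration, that every dimension has 10 members.

```python
from math import prod


def cell_count(cardinalities):
    """Number of cells in a cube, given the member count of each dimension."""
    return prod(cardinalities)


# Hypothetical cube in which every dimension has 10 members.
for n_dims in range(1, 7):
    print(n_dims, "dimensions:", cell_count([10] * n_dims), "cells")
```

Each extra dimension multiplies the cell count by that dimension's member count, so summing cells 'on the fly' becomes expensive quickly.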

However, the majority of multi-dimensional queries use summarized, high-level data.

The solution is to pre-aggregate (consolidate) all logical subtotals and totals along all dimensions.
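One way to picture pre-aggregation is to store, next to each detail cell, every subtotal obtained by summing out one or more dimensions. The sketch below is a minimal illustration under assumptions: the `ALL` marker, the function name `pre_aggregate`, and the figures are all hypothetical, and real OLAP engines use far more compact schemes.

```python
from collections import defaultdict
from itertools import product

ALL = "ALL"  # hypothetical marker meaning "summed over this dimension"

# Detail cells keyed by (property type, city, quarter); figures are invented.
facts = {
    ("Flat", "Glasgow", "Q1"): 100,
    ("House", "Glasgow", "Q1"): 150,
    ("Flat", "London", "Q1"): 200,
    ("House", "London", "Q2"): 250,
}


def pre_aggregate(facts):
    """Compute every subtotal and total along all dimensions."""
    totals = defaultdict(int)
    for key, value in facts.items():
        # Each coordinate is either kept or replaced by ALL,
        # so one fact contributes to 2**n stored cells.
        for combo in product(*[(member, ALL) for member in key]):
            totals[combo] += value
    return dict(totals)


totals = pre_aggregate(facts)
print(totals[(ALL, ALL, ALL)])      # grand total
print(totals[("Flat", ALL, ALL)])   # all flats, any city, any quarter
print(totals[(ALL, "Glasgow", "Q1")])
```

After this step a summary query is a single dictionary lookup rather than a sum over many cells, at the cost of extra storage.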

Pre-aggregation is valuable because typical dimensions are hierarchical in nature (e.g., the time dimension hierarchy: years, quarters, months, weeks, and days).

A predefined hierarchy allows logical pre-aggregation and, conversely, allows for a logical ‘drill-down’.
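A small sketch of rolling daily figures up a time hierarchy follows. The sales values and function names are hypothetical, and the week level is left out for brevity.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales figures.
daily_sales = {
    date(2004, 1, 15): 10,
    date(2004, 2, 3): 20,
    date(2004, 4, 9): 30,
    date(2004, 11, 30): 40,
}


# Each function maps a day to its ancestor at one level of the hierarchy.
def month_of(d):
    return (d.year, d.month)


def quarter_of(d):
    return (d.year, (d.month - 1) // 3 + 1)


def year_of(d):
    return d.year


def roll_up(sales, level):
    """Aggregate daily values to the given hierarchy level."""
    totals = defaultdict(int)
    for day, amount in sales.items():
        totals[level(day)] += amount
    return dict(totals)


by_month = roll_up(daily_sales, month_of)
by_quarter = roll_up(daily_sales, quarter_of)
by_year = roll_up(daily_sales, year_of)
print(by_quarter)
print(by_year)
```

Reading the levels in the other direction (year, then quarter, then month, then day) is the drill-down path that the hierarchy makes available.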

Supports common analytical operations

Consolidation

Drill-down

Slicing and dicing

Pivoting

Consolidation - aggregation of data such as simple ‘roll-ups’ or complex expressions involving inter-related data.

Drill-down - the reverse of consolidation: displaying the detailed data that makes up the consolidated data. It is the investigation of information in detail (e.g., finding not only total sales but also sales by region, by product, or by salesperson), that is, finding the detailed sources.
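The two operations can be sketched as a pair of small functions. The sales rows, field names, and function names below are hypothetical, chosen to match the region, product, and salesperson example.

```python
from collections import defaultdict

FIELDS = ("region", "product", "salesperson")

# Hypothetical detail rows: (region, product, salesperson, sales).
sales = [
    ("North", "Widget", "Ana", 100),
    ("North", "Gadget", "Ana", 50),
    ("North", "Widget", "Luis", 70),
    ("South", "Widget", "Eva", 30),
]


def consolidate(rows, by):
    """Roll detail rows up to totals per member of one dimension."""
    idx = FIELDS.index(by)
    totals = defaultdict(int)
    for row in rows:
        totals[row[idx]] += row[-1]
    return dict(totals)


def drill_down(rows, by, member, into):
    """Break one consolidated total down by another dimension."""
    idx = FIELDS.index(by)
    detail = [row for row in rows if row[idx] == member]
    return consolidate(detail, into)


total_sales = sum(row[-1] for row in sales)
by_region = consolidate(sales, "region")
north_by_person = drill_down(sales, "region", "North", "salesperson")
print(total_sales, by_region, north_by_person)
```

Here the total is first consolidated by region, and the North figure is then drilled into to show which salespeople it is made of.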

Slicing and dicing: the ability to look at the data from different viewpoints.

dice: to cut the cube into smaller cubes

slice: a section of a cube selected by specifying its lower and upper limits

[Figure: example cube operations dice(color) and slice(color, mes), where mes is the month dimension]
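A minimal sketch of both operations follows. It uses the common OLAP convention, which is an assumption beyond the short definitions above: a slice fixes one dimension to a single member, and a dice keeps a smaller sub-cube by restricting several dimensions to sets of members. The dimension names, data, and function names are hypothetical.

```python
DIMS = ("color", "month", "model")

# Hypothetical cube cells keyed by (color, month, model).
cells = {
    ("red", "Jan", "sedan"): 5,
    ("red", "Feb", "sedan"): 7,
    ("blue", "Jan", "coupe"): 3,
    ("blue", "Mar", "sedan"): 4,
}


def slice_cube(cells, dim, member):
    """Fix one dimension to a single member; the result has one fewer dimension."""
    i = DIMS.index(dim)
    return {k[:i] + k[i + 1:]: v for k, v in cells.items() if k[i] == member}


def dice_cube(cells, **allowed):
    """Keep cells whose members fall in the allowed sets; dimensionality is unchanged."""
    return {
        k: v
        for k, v in cells.items()
        if all(k[DIMS.index(d)] in members for d, members in allowed.items())
    }


red_slice = slice_cube(cells, "color", "red")
sub_cube = dice_cube(cells, color={"red", "blue"}, month={"Jan", "Feb"})
print(red_slice)
print(sub_cube)
```

The slice result is keyed by (month, model) only, while the dice result keeps all three dimensions but covers fewer members.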

Pivoting:

Pivoting deals with presentation.

Choose some dimensions X1, ..., Xi to appear on the x-axis and some dimensions Y1, ..., Yj to appear on the y-axis.
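As an illustration, the sketch below builds a two-dimensional view from chosen x and y dimensions and sums over any dimension that is not shown. The data and the function name `pivot` are hypothetical.

```python
from collections import defaultdict

DIMS = ("type", "city", "quarter")

# Hypothetical cube cells keyed by (type, city, quarter).
cells = {
    ("Flat", "Glasgow", "Q1"): 100,
    ("House", "Glasgow", "Q1"): 150,
    ("Flat", "London", "Q1"): 200,
    ("House", "London", "Q2"): 250,
}


def pivot(cells, x_dims, y_dims):
    """Return a view keyed by (y members, x members); other dimensions are summed."""
    xi = [DIMS.index(d) for d in x_dims]
    yi = [DIMS.index(d) for d in y_dims]
    view = defaultdict(int)
    for key, value in cells.items():
        x = tuple(key[i] for i in xi)
        y = tuple(key[i] for i in yi)
        view[(y, x)] += value
    return dict(view)


# Cities down the side, quarters across the top.
city_by_quarter = pivot(cells, x_dims=["quarter"], y_dims=["city"])
# The same data with the axes swapped.
quarter_by_city = pivot(cells, x_dims=["city"], y_dims=["quarter"])
# Two dimensions nested on the x-axis.
city_by_type_quarter = pivot(cells, x_dims=["type", "quarter"], y_dims=["city"])
print(city_by_quarter)
```

The underlying cells never change; only the assignment of dimensions to axes does, which is why pivoting is a presentation operation.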

Can store data in a compressed form by dynamically selecting physical storage organizations and compression techniques that maximize space utilization.

Dense data (that is, data that exists for a high percentage of cells) can be stored separately from sparse data (that is, data for which a significant percentage of cells are empty).

The ability to omit empty or repetitive cells can greatly reduce the size of the cube and the amount of processing.

This allows analysis of exceptionally large amounts of data.
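One simple way to omit empty cells is to store only the populated ones in a dictionary keyed by coordinates. This is a sketch under assumptions, not a description of how any particular OLAP product stores data; the cube shape, values, and helper names are hypothetical.

```python
# Hypothetical cube shape: 100 products x 100 stores x 12 months.
shape = (100, 100, 12)

# Sparse storage: only non-empty cells are kept, keyed by coordinates.
sparse = {
    (0, 5, 1): 100,
    (42, 7, 3): 250,
    (42, 7, 4): 50,
    (99, 99, 11): 75,
}


def get(cell):
    """Read a cell; anything not stored is treated as empty (0)."""
    return sparse.get(cell, 0)


def total_over_months(product, store):
    """Sum one product/store pair over the month axis, touching only stored cells."""
    return sum(v for (p, s, _), v in sparse.items() if (p, s) == (product, store))


dense_cells = shape[0] * shape[1] * shape[2]
stored_cells = len(sparse)
print(dense_cells, "cells in a dense array versus", stored_cells, "stored")
print(total_over_months(42, 7))
```

Both the memory used and the work done by an aggregation scale with the number of populated cells rather than with the full size of the cube.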

In summary, pre-aggregation, dimensional hierarchy, and sparse data management can significantly reduce the size of the cube and the need to calculate values ‘on-the-fly’.

Removes need for multi-table joins and provides quick and direct access to arrays of data, thus significantly speeding up execution of multi-dimensional queries.

Efraim Turban. Business Intelligence. Prentice Hall, 2008.