Dimensional Modeling 101 - Purdue University

Post on 16-Mar-2018

Boiler Insight Data Mart Training – Part 1

Dimensional Modeling

101

Objective

Give you an overview of:

• The basic concepts of Dimensional Modeling

• What dimensional models look like in Cognos

• The ways dimensionally modeled data is accessed

• How historical data is handled

• The eight “Key Concepts” of dimensionally modeled data

to give you a good foundation when you attend the data training

2

It’s all Ralph’s Fault

Ralph Kimball

• Widely regarded as one of the original architects of data warehousing

• Developed a methodology for building the data marts in a data warehouse

• Kimball’s methodology results in a structure for the data that is easy to use and very fast to access

• When following the Kimball methodology you are doing dimensional modeling

3

So what?

All this is great, but why do we care about Ralph and his methodology?

Kimball’s methodology results in a structure for the data that is easy to use and very fast to access.

4

Kimball Methodology

Four-Step Dimensional Design Process for a Star Schema as part of a Data Mart:

1. Select the subject area.

2. Declare the grain of the business process.

3. Choose the dimensions that apply to each fact table row and define their attributes.

4. Identify the numeric facts that will populate each fact table row.

5

Subject Area & Data Mart

• Subject Area – The data related to one area of the business

• Data Mart – A group of data from a given subject area specifically structured for access and reporting

The first subject area selected for the Boiler Insight (BI) Data Warehouse was Human Resources (HR).

So, we built an HR Data Mart.

7

Star Schema (a.k.a. Star)

• A Star Schema is the structure of the data in the data mart

• It is the result of executing the Kimball methodology (doing dimensional modeling)

• Each Star Schema holds data for a specific subset of the Data Mart

Examples of star schemas in the HR Data Mart:

Employee Actions
Employee Appointment
Employee Leave
Compensation Plan
Payroll Charge
Employee Quota

8

Star Schema

[Figure: a star schema, with a central Fact Table linked to Dimension 1 through Dimension 5]

9

Facts (a.k.a. Measures)

• Facts are the numbers that result from transactions being performed

Examples:

The amount we pay each employee at each payroll

The number of days the employee is on leave

The count of each employee action that occurs

• Facts are generally additive (that is, they can be summed); semi-additive facts, covered in Key Concept #4, are the exception

10

Fact Table

• Facts are grouped into the Fact Table

• Each row usually represents one specific transaction

[Figure: a star schema with the central Fact Table highlighted among Dimension 1 through Dimension 5]

11

Dimensions

• Dimensions provide the means to filter, group, and label the facts.

Examples:

Employee

Date

Organization

Cost Center

• Dimensions are often used in “by” phrases:

“I want total payroll by Org Unit, by Month.”

Remember: FGL (“figgle”)

12

Dimensions

[Figure: a star schema with Dimension 1 through Dimension 5 highlighted around the Fact Table]

• There are typically 5-15 dimensions in a Star Schema

• Dimensions are made up of Attributes

13

Attributes

• Attributes are the actual fields in each dimension

Examples:

Employee Dimension: PERNR, Name, Employee Group, Employment Status

Date Dimension: Date, Month, Year, Quarter, Fiscal Period

Cost Center Dimension: Cost Center, Department, Person Responsible, Locked Flag

• There are special attributes that are in the Fact table called Degenerate Dimension attributes

• They are like miscellaneous attributes that don’t fit cleanly in any dimension

• Treat them like any other attribute

14

Grain

• A statement that defines the most atomic (smallest) level for the records in a dimension or fact table.

Examples:

Employee Dimension – One record for each employee in a position

Payroll Charge Fact Table – One record per employee per wage type per payroll period

• The grain for a fact table is often put in terms of the dimensions.
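A declared grain can be checked mechanically: if the grain is "one record per employee per wage type per payroll period", no combination of those three values may appear twice. The sketch below is only an illustration under assumptions; the wage type codes, period labels, and the helper name `grain_violations` are hypothetical and do not come from the HR Data Mart.

```python
from collections import Counter

# Hypothetical fact rows: (employee, wage_type, payroll_period, amount).
rows = [
    ("90025609", "REG", "2012-15", 700),
    ("90025609", "OT", "2012-15", 50),
    ("90045001", "REG", "2012-15", 934),
    ("90025609", "REG", "2012-15", 700),  # repeats a key, so it breaks the grain
]


def grain_violations(rows, key_len=3):
    """Return the grain keys (first key_len columns) found on more than one row."""
    counts = Counter(row[:key_len] for row in rows)
    return [key for key, n in counts.items() if n > 1]


violations = grain_violations(rows)
print(violations)
```

An empty result would mean every row sits at the declared grain.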

15

Key Concept #1: Understand the Basic Building Blocks

• Data Marts are made up of star schemas

• Stars have one fact table and many dimensions

• The fact table contains facts and links to the dimensions

• Facts are numbers and are additive

• The dimensions contain attributes

• Dimensions (or rather their attributes) are used to filter, group, and label (“figgle”)

16

Putting it all together

[Figure: SAP as the source system and the Boiler Insight Data Warehouse containing the HR Data Mart, which is made up of star schemas (each a Fact Table surrounded by Dimension 1 through Dimension 5)]

17

Putting it all together

[Figure: the Boiler Insight Data Warehouse containing the HR Data Mart and its star schemas (each a Fact Table surrounded by Dimension 1 through Dimension 5), with Cognos as the access tool]

18

Accessing a Data Mart

• There are three primary ways to access a data mart:

1. Query one dimension without using the whole star

2. Query a single star

3. Query multiple stars

20

Key Concept #2: Know Which Access You Are Using

• You will always be using one of the three methods:

1. Single dimension

2. Single Star

3. Multiple Stars

• As soon as you are using two dimensions you have moved from access method 1 to 2 and you are using the relations to the fact table.

21

Payroll Charge Star

[Figure: the Payroll Charge Fact Table surrounded by its dimensions: Posting Date, GL Account, Funds Center, Fund, Sponsored Program, Cost Center, FI Document Type, Organization, Position, Payroll Run Info, Wage Type, Payroll Period, Grant, Functional Area, Order, Employee, Commitment Item, Job]

22

Role Playing Dimensions

[Figure: the Payroll Charge Fact Table linked to both Fund and HR Fund, labeled Role Playing Dimension]

23

Simplified Payroll Charge Star

[Figure: the Payroll Charge Fact Table linked to three dimensions: Date, Cost Center, and Employee]

24

Employee Dimension

PERNR     Name  Employee Group  Employment Status
90025609  Bob   Admin/Prof      Active
90045001  Sue   Faculty         Active
90033276  John  Clerical        Withdrawn
90025640  Mary  Admin/Prof      Active
90070013  Bill  Service         Active

26

Date Dimension

Date     Month  Year  Quarter  Fiscal Period
7/29/12  7      2012  Q3       1/2013
7/30/12  7      2012  Q3       1/2013
7/31/12  7      2012  Q3       1/2013
8/1/12   8      2012  Q3       2/2013
8/2/12   8      2012  Q3       2/2013
8/3/12   8      2012  Q3       2/2013
8/4/12   8      2012  Q3       2/2013

27

Cost Center Dimension

Cost Center  Department  Person Responsible  Locked Flag
4008007002   4008007     Jose                N
4008007003   4008007     Mary                N
4090023001   4090023     Jane                Y

28

Payroll Charge Star

[Figure: the Payroll Charge Fact Table linked to the Date, Cost Center, and Employee dimensions, each shown with the sample rows from the three preceding slides]

29

Payroll Charge Fact Table

Employee  Date     Cost Center  Payroll Amount  Note
90025609  7/30/12  4008007002   $750            Payroll run: paid everyone
90045001  7/30/12  4090023001   $934            Payroll run: paid everyone
90033276  7/30/12  4008007003   $235            Payroll run: paid everyone
90025640  7/30/12  4008007003   $1,023          Payroll run: paid everyone
90070013  7/30/12  4008007003   $125            Payroll run: paid everyone
90033276  7/31/12  4008007003   $45             Off-cycle payroll to fix problem for John
(John quits)
90025609  8/2/12   4008007002   $750            Payroll run: paid everyone
90045001  8/2/12   4090023001   $934            Payroll run: paid everyone
90025640  8/2/12   4008007003   $1,023          Payroll run: paid everyone
90070013  8/2/12   4008007003   $125            Payroll run: paid everyone

30

Payroll Charge Star

[Figure: the complete Payroll Charge Star, with the Payroll Charge Fact Table rows joined to the Employee, Date, and Cost Center dimension rows from the preceding slides]

31

Accessing a Dimension

Question: Is John an active employee?

PERNR     Name  Employee Group  Employment Status
90025609  Bob   Admin/Prof      Active
90045001  Sue   Faculty         Active
90033276  John  Clerical        Withdrawn
90025640  Mary  Admin/Prof      Active
90070013  Bill  Service         Active

32

Accessing a Star Schema

Question: What is the total payroll for department 4008007 for July 2012?

Filters:
Department = 4008007
Month = 7
Year = 2012

The Payroll Amount is automatically summed across the matching fact rows:

Depart   Month/Year  Payroll Amount
4008007  7/2012      $2,178

[Figure: the Payroll Charge Star with the Date, Cost Center, and Employee dimension rows and the fact table rows from the preceding slides]

33

Accessing a Star Schema

Question: What is the total payroll for department 4008007 for July 2012? Display by Employee Name.

Filters:
Department = 4008007
Month = 7
Year = 2012

John’s two July values ($235 and $45) are automatically summed:

Depart   Month/Year  Emp. Name  Payroll Amount
4008007  7/2012      Bob        $750
                     John       $280
                     Mary       $1,023
                     Bill       $125
                     Total      $2,178

[Figure: the Payroll Charge Star with the Date, Cost Center, and Employee dimension rows and the fact table rows from the preceding slides]
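The single-star query above can be traced by hand in a few lines of Python. This is only a sketch of the logic, not how Cognos executes the query: the dictionaries stand in for the Cost Center and Employee dimensions, the `date` objects stand in for the Date dimension's Month and Year attributes, and the function name `payroll_by_name` is an assumption made for illustration. Summing the fact rows exactly as listed on the slides gives $2,178 for department 4008007 in July 2012.

```python
from datetime import date

# Stand-ins for the Cost Center and Employee dimensions from the slides.
cost_center_dim = {
    "4008007002": "4008007",
    "4008007003": "4008007",
    "4090023001": "4090023",
}
employee_dim = {
    "90025609": "Bob", "90045001": "Sue", "90033276": "John",
    "90025640": "Mary", "90070013": "Bill",
}
# Fact rows: (employee, date, cost center, payroll amount).
facts = [
    ("90025609", date(2012, 7, 30), "4008007002", 750),
    ("90045001", date(2012, 7, 30), "4090023001", 934),
    ("90033276", date(2012, 7, 30), "4008007003", 235),
    ("90025640", date(2012, 7, 30), "4008007003", 1023),
    ("90070013", date(2012, 7, 30), "4008007003", 125),
    ("90033276", date(2012, 7, 31), "4008007003", 45),
    ("90025609", date(2012, 8, 2), "4008007002", 750),
    ("90045001", date(2012, 8, 2), "4090023001", 934),
    ("90025640", date(2012, 8, 2), "4008007003", 1023),
    ("90070013", date(2012, 8, 2), "4008007003", 125),
]


def payroll_by_name(department, year, month):
    """Filter through the dimensions, then sum Payroll Amount per employee name."""
    totals = {}
    for emp, day, cost_center, amount in facts:
        if cost_center_dim[cost_center] != department:
            continue
        if (day.year, day.month) != (year, month):
            continue
        name = employee_dim[emp]
        totals[name] = totals.get(name, 0) + amount
    return totals


by_name = payroll_by_name("4008007", 2012, 7)
total = sum(by_name.values())
print(by_name, total)
```

John appears once in the result because his two July fact rows are summed under the single displayed name.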

34

Key Concept #3: Understand Auto-Aggregation

• Cognos will always auto-aggregate facts for you in dimensional models

• Only facts aggregate, not attributes

• The fields you display in your report/query determine the aggregation; it will aggregate facts across every dimension attribute you do not display (after filtering)

• Tip: Filter first and be as narrow as you can

• Wide filters will require Cognos to retrieve and aggregate a lot of rows
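The rule that the displayed fields determine the aggregation can be mimicked with a small group-by. The sketch below only imitates the behavior described above and is not Cognos's implementation; the function name `auto_aggregate` and the row layout are assumptions. The rows are the July 2012 fact rows for department 4008007, already filtered.

```python
from collections import defaultdict

# July 2012 rows for department 4008007, after filtering and joining.
rows = [
    {"name": "Bob", "cost_center": "4008007002", "amount": 750},
    {"name": "John", "cost_center": "4008007003", "amount": 235},
    {"name": "Mary", "cost_center": "4008007003", "amount": 1023},
    {"name": "Bill", "cost_center": "4008007003", "amount": 125},
    {"name": "John", "cost_center": "4008007003", "amount": 45},
]


def auto_aggregate(rows, displayed, fact="amount"):
    """Sum the fact for each distinct combination of the displayed attributes."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[attr] for attr in displayed)
        totals[key] += row[fact]
    return dict(totals)


by_cost_center = auto_aggregate(rows, ["cost_center"])
by_employee = auto_aggregate(rows, ["name"])
grand_total = auto_aggregate(rows, [])
print(by_cost_center, by_employee, grand_total)
```

Displaying fewer attributes collapses more rows together; displaying none leaves a single grand total.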

35

Key Concept #4: Understand Semi-Additive Facts

• A semi-additive fact is a fact that cannot be aggregated for a certain dimension

• You must include at least one attribute from that dimension in the display of your report/query or filter to a single value in that dimension

• Noted as “Semi-additive on <dimension name>”

36

Semi-additive Fact Example

• In Employee Appointment, Head Count is semi-additive on the Summary Month dimension; it cannot be aggregated across months

• Anytime you use this fact, you must either filter to a specific month or you must display your data by month

• If you do not display or filter by month, you will count the same person multiple times and get an incorrect answer
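A tiny example makes the double counting visible. The rows below are hypothetical (the slides do not show Employee Appointment data); they assume one row per employee per summary month, each carrying a Head Count of 1.

```python
# Hypothetical Employee Appointment rows: (summary_month, employee, head_count).
appointments = [
    ("2012-07", "Bob", 1), ("2012-07", "Sue", 1), ("2012-07", "John", 1),
    ("2012-08", "Bob", 1), ("2012-08", "Sue", 1),
]


def head_count(rows, month=None):
    """Sum Head Count, optionally filtered to a single summary month."""
    return sum(count for m, _, count in rows if month is None or m == month)


def head_count_by_month(rows):
    """Sum Head Count separately for each summary month (displayed by month)."""
    totals = {}
    for month, _, count in rows:
        totals[month] = totals.get(month, 0) + count
    return totals


unfiltered = head_count(appointments)     # sums across months: misleading
july_only = head_count(appointments, "2012-07")
per_month = head_count_by_month(appointments)
print(unfiltered, july_only, per_month)
```

The unfiltered sum is 5 even though only three different people appear, which is exactly the incorrect answer the slide warns about.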

37

History

• History happens naturally in the fact tables

• Changes can also occur that affect the dimensions

PERNR     Name  Employee Group  Employment Status
90025609  Bob   Admin/Prof      Active
90045001  Sue   Faculty         Active
90033276  John  Clerical        Withdrawn
90025640  Mary  Admin/Prof      Active
90070013  Bill  Service         Active

In the current version of our Employee Dimension, history is lost.

Example: John quit

38

Dimension History

• To keep history for a dimension, you need to create a new row each time something changes

PERNR     Name  Employee Group  Employment Status
90025609  Bob   Admin/Prof      Active
90045001  Sue   Faculty         Active
90033276  John  Clerical        Active
90033276  John  Clerical        Withdrawn
90025640  Mary  Admin/Prof      Active
90070013  Bill  Service         Active

Now we have the history of John but with two problems:

1. We still don’t know when the change happened

2. We have duplicated the “key” that is used to link to the fact table

39

Dimension History

• To fix the problem of knowing when changes happen, we need to add some dates

PERNR     First Effective Date  Last Effective Date  Name  Employee Group  Employment Status
90025609  1/1/07                12/31/9999           Bob   Admin/Prof      Active
90045001  1/1/07                12/31/9999           Sue   Faculty         Active
90033276  1/1/07                7/31/12              John  Clerical        Active
90033276  8/1/12                12/31/9999           John  Clerical        Withdrawn
90025640  1/1/07                12/31/9999           Mary  Admin/Prof      Active
90070013  1/1/07                12/31/9999           Bill  Service         Active

The First and Last Effective Dates define the timeframe during which the data in a given row applies

40

Dimension History

• To fix the problem of duplicate keys we need to add another column that holds a special key that will never be duplicated, called a Surrogate Identifier (SID)

SID  PERNR     First Effective Date  Last Effective Date  Name  Employee Group  Employment Status
1    90025609  1/1/07                12/31/9999           Bob   Admin/Prof      Active
2    90045001  1/1/07                12/31/9999           Sue   Faculty         Active
3    90033276  1/1/07                7/31/12              John  Clerical        Active
4    90033276  8/1/12                12/31/9999           John  Clerical        Withdrawn
5    90025640  1/1/07                12/31/9999           Mary  Admin/Prof      Active
6    90070013  1/1/07                12/31/9999           Bill  Service         Active

41

Dimension History

• Let’s introduce another change

SID  PERNR     First Effective Date  Last Effective Date  Name   Employee Group  Employment Status
1    90025609  1/1/07                12/31/9999           Bob    Admin/Prof      Active
2    90045001  1/1/07                8/1/12               Sue    Faculty         Active
7    90045001  8/2/12                12/31/9999           Suzie  Faculty         Active
3    90033276  1/1/07                7/31/12              John   Clerical        Active
4    90033276  8/1/12                12/31/9999           John   Clerical        Withdrawn
5    90025640  1/1/07                12/31/9999           Mary   Admin/Prof      Active
6    90070013  1/1/07                12/31/9999           Bill   Service         Active

42

Dimension History

The fact table now links to the Employee dimension through the SID instead of the PERNR:

Employee (SID)  Date     Cost Center  Payroll Amount
1               7/30/12  4008007002   $750
2               7/30/12  4090023001   $934
3               7/30/12  4008007003   $235
5               7/30/12  4008007003   $1,023
6               7/30/12  4008007003   $125
3               7/31/12  4008007003   $45
1               8/2/12   4008007002   $750
7               8/2/12   4090023001   $934
5               8/2/12   4008007003   $1,023
6               8/2/12   4008007003   $125

Question: What is the total payroll where:
Department = 4090023
Month = August
Year = 2012

Depart   Date    Emp. Name  Payroll Amount
4090023  8/2012  Suzie      $934
                 Total      $934

[Figure: the Payroll Charge Star with the Date and Cost Center dimension rows from the earlier slides and the Employee dimension rows with SID and effective dates from the previous slide]

43

Dimension History

Question: What is the total payroll where:
Department = 4090023
Month = July
Year = 2012

Depart   Date    Emp. Name  Payroll Amount
4090023  7/2012  Sue        $934
                 Total      $934

Question: What is the total payroll where:
Department = 4090023
Month = July - August
Year = 2012

Depart   Date    Emp. Name  Payroll Amount
4090023  7/2012  Sue        $934
4090023  8/2012  Suzie      $934
                 Total      $1,868

[Figure: the Payroll Charge Star with the SID-keyed fact table rows, the Date and Cost Center dimension rows, and the Employee dimension rows with SID and effective dates]
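The reason the July row reports "Sue" and the August row reports "Suzie" is that each fact row carries the SID of the dimension row that was in effect when the fact was recorded. A minimal sketch of that lookup follows; the dictionary layout and the function name `payroll` are assumptions made for illustration, while the values are taken from the slides.

```python
from datetime import date

# Employee dimension keyed by SID: SID -> (PERNR, name).
employee_dim = {
    1: ("90025609", "Bob"), 2: ("90045001", "Sue"), 7: ("90045001", "Suzie"),
    3: ("90033276", "John"), 4: ("90033276", "John"),
    5: ("90025640", "Mary"), 6: ("90070013", "Bill"),
}
department_of = {
    "4008007002": "4008007", "4008007003": "4008007", "4090023001": "4090023",
}
# Fact rows: (employee SID, date, cost center, payroll amount).
facts = [
    (1, date(2012, 7, 30), "4008007002", 750),
    (2, date(2012, 7, 30), "4090023001", 934),
    (3, date(2012, 7, 30), "4008007003", 235),
    (5, date(2012, 7, 30), "4008007003", 1023),
    (6, date(2012, 7, 30), "4008007003", 125),
    (3, date(2012, 7, 31), "4008007003", 45),
    (1, date(2012, 8, 2), "4008007002", 750),
    (7, date(2012, 8, 2), "4090023001", 934),
    (5, date(2012, 8, 2), "4008007003", 1023),
    (6, date(2012, 8, 2), "4008007003", 125),
]


def payroll(department, months):
    """Sum Payroll Amount by (month/year, name as it was when the fact occurred)."""
    result = {}
    for sid, day, cost_center, amount in facts:
        if department_of[cost_center] != department:
            continue
        if (day.year, day.month) not in months:
            continue
        key = (f"{day.month}/{day.year}", employee_dim[sid][1])
        result[key] = result.get(key, 0) + amount
    return result


july_august = payroll("4090023", {(2012, 7), (2012, 8)})
print(july_august, sum(july_august.values()))
```

Both rows belong to PERNR 90045001, but no date logic is needed in the join: the SID alone selects the right version of the employee.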

44

Dimension History

• A dimension in which you are keeping track of the changes of its attributes is called a Slowly Changing Dimension

• You decide for each attribute in the dimension whether it is slowly changing or not

• This training covers four types of slowly changing dimensions:

Type 1
Type 2
Type 3
Type 6

45

SCD – Type 1

• Only keep the current value

PERNR     Name  Employee Group  Employment Status
90025609  Bob   Admin/Prof      Active
90045001  Sue   Faculty         Active
90033276  John  Clerical        Withdrawn
90025640  Mary  Admin/Prof      Active
90070013  Bill  Service         Active

Advantage: Simple

Disadvantage: Lose history

46

SCD – Type 2

• Keep all historic values

SID  PERNR     First Effective Date  Last Effective Date  Name   Employee Group  Employment Status
1    90025609  1/1/07                12/31/9999           Bob    Admin/Prof      Active
2    90045001  1/1/07                8/1/12               Sue    Faculty         Active
7    90045001  8/2/12                12/31/9999           Suzie  Faculty         Active
3    90033276  1/1/07                7/31/12              John   Clerical        Active
4    90033276  8/1/12                12/31/9999           John   Clerical        Withdrawn
5    90025640  1/1/07                12/31/9999           Mary   Admin/Prof      Active
6    90070013  1/1/07                12/31/9999           Bill   Service         Active

Advantage: Can see history and partition historic facts

Disadvantage: Can’t associate new values with historic facts
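A Type 2 change amounts to two steps: close out the current row by setting its Last Effective Date, then add a new row with a fresh SID and an open-ended Last Effective Date. The sketch below illustrates this with Sue's name change; the function name `apply_type2_change`, the dictionary layout, and passing in `next_sid` by hand are assumptions, not a description of how the Boiler Insight load process works.

```python
from datetime import date, timedelta

OPEN_END = date(9999, 12, 31)  # the 12/31/9999 "still current" marker


def apply_type2_change(dim_rows, pernr, changes, effective, next_sid):
    """Expire the current row for pernr and append a new row holding the changes."""
    for row in dim_rows:
        if row["pernr"] == pernr and row["last_effective"] == OPEN_END:
            new_row = {**row, **changes, "sid": next_sid,
                       "first_effective": effective, "last_effective": OPEN_END}
            # The old row stops applying the day before the change takes effect.
            row["last_effective"] = effective - timedelta(days=1)
            dim_rows.append(new_row)
            return new_row
    raise KeyError(f"no current row for PERNR {pernr}")


rows = [{
    "sid": 2, "pernr": "90045001", "name": "Sue",
    "employee_group": "Faculty", "employment_status": "Active",
    "first_effective": date(2007, 1, 1), "last_effective": OPEN_END,
}]
apply_type2_change(rows, "90045001", {"name": "Suzie"}, date(2012, 8, 2), next_sid=7)
for row in rows:
    print(row["sid"], row["name"], row["first_effective"], row["last_effective"])
```

After the call the list holds SID 2 (Sue, ending 8/1/12) and SID 7 (Suzie, starting 8/2/12), matching the table above.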

47

SCD – Type 3

• Keep one historic value in another attribute

PERNR     Name   Previous Name  Employee Group  Employment Status
90025609  Bob    Bob            Admin/Prof      Active
90045001  Suzie  Sue            Faculty         Active
90033276  John   John           Clerical        Withdrawn
90025640  Mary   Mary           Admin/Prof      Active
90070013  Bill   Bill           Service         Active

Advantage: Can associate the current or previous value with facts

Disadvantage: Still lose history (only one previous value)

48

SCD – Type 6

• Combination of Types 1, 2 & 3

SID  PERNR     First Effective Date  Last Effective Date  Name   Current Name  Employee Group  Employment Status
1    90025609  1/1/07                12/31/9999           Bob    Bob           Admin/Prof      Active
2    90045001  1/1/07                8/1/12               Sue    Suzie         Faculty         Active
7    90045001  8/2/12                12/31/9999           Suzie  Suzie         Faculty         Active
3    90033276  1/1/07                7/31/12              John   John          Clerical        Active
4    90033276  8/1/12                12/31/9999           John   John          Clerical        Withdrawn
5    90025640  1/1/07                12/31/9999           Mary   Mary          Admin/Prof      Active
6    90070013  1/1/07                12/31/9999           Bill   Bill          Service         Active

Advantage: Full history and can associate the current value with all facts

Disadvantage: Very complex

49

Accessing a Dimension (Revisited)

Question: Is John an active employee as of today?

SID  PERNR     First Effective Date  Last Effective Date  Name   Employee Group  Employment Status
1    90025609  1/1/07                12/31/9999           Bob    Admin/Prof      Active
2    90045001  1/1/07                8/1/12               Sue    Faculty         Active
7    90045001  8/2/12                12/31/9999           Suzie  Faculty         Active
3    90033276  1/1/07                7/31/12              John   Clerical        Active
4    90033276  8/1/12                12/31/9999           John   Clerical        Withdrawn
5    90025640  1/1/07                12/31/9999           Mary   Admin/Prof      Active
6    90070013  1/1/07                12/31/9999           Bill   Service         Active

For a dimension with type 2 or 6 attributes, you must include some date component in your query!
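Without a date condition, a query for John returns both of his rows, one Active and one Withdrawn. An "as of" lookup picks the single row whose effective range contains the date of interest. The sketch below assumes a simple tuple layout and a hypothetical function name `status_as_of`; only John's two rows are included.

```python
from datetime import date

# (SID, PERNR, first effective, last effective, name, employment status)
employee_dim = [
    (3, "90033276", date(2007, 1, 1), date(2012, 7, 31), "John", "Active"),
    (4, "90033276", date(2012, 8, 1), date(9999, 12, 31), "John", "Withdrawn"),
]


def status_as_of(pernr, as_of):
    """Return the Employment Status from the row in effect on the given date."""
    for _sid, row_pernr, first, last, _name, status in employee_dim:
        if row_pernr == pernr and first <= as_of <= last:
            return status
    return None  # no row covers that date


before = status_as_of("90033276", date(2012, 7, 15))
after = status_as_of("90033276", date(2012, 8, 15))
print(before, after)
```

Passing `date.today()` as the second argument answers the "as of today" form of the question.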

50

Accessing a Dimension (Revisited)

SID  PERNR     First Effective Date  Last Effective Date  Name   Employee Group  Employment Status  Current Record Flag
1    90025609  1/1/07                12/31/9999           Bob    Admin/Prof      Active             Y
2    90045001  1/1/07                8/1/12               Sue    Faculty         Active             N
7    90045001  8/2/12                12/31/9999           Suzie  Faculty         Active             Y
3    90033276  1/1/07                7/31/12              John   Clerical        Active             N
4    90033276  8/1/12                12/31/9999           John   Clerical        Withdrawn          Y
5    90025640  1/1/07                12/31/9999           Mary   Admin/Prof      Active             Y
6    90070013  1/1/07                12/31/9999           Bill   Service         Active             Y

51

Key Concept #5: Use the Correct Dates

• When querying against a dimension, use the First and Last Effective Dates (or the Current Record Flag)

• When querying against a star, use one of the Date dimensions

[Figure: the Employee dimension rows with SID and effective dates, shown beside the simplified Payroll Charge star (Payroll Charge Fact Table linked to Date, Cost Center, and Employee)]

52

Accessing Multiple Stars (a.k.a. cross-star query)

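[Figure: two stars side by side: the Payroll Charge Fact Table with the Date, Cost Center, and Employee dimensions, and the Employee Leave fact table with the Date, Employee, and Leave Type dimensions]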

53

Accessing Multiple Stars

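[Figure: the Payroll Charge Fact Table and the Employee Leave fact table joined through the Employee and Date dimensions they share, labeled Conformed Dimensions; Cost Center belongs only to Payroll Charge and Leave Type only to Employee Leave]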

54

Accessing Multiple Stars

[Figure: the Payroll Charge Fact Table and the Employee Leave fact table sharing the Employee and Date dimensions, with Cost Center and Leave Type each attached to one star]

Depart   Date    Leave       Emp. Name  Payroll Amount
4090023  8/2012  Sabbatical  Suzie      $934

55

Key Concept #6: Understand Conformed Dimensions

• A conformed dimension is a dimension used in more than one star

• To do a cross-star query, the stars must share a conformed dimension
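One common way to think about a cross-star query is: aggregate each star separately down to the conformed dimensions, then line the results up on those shared keys. The sketch below illustrates that idea and makes no claim about how Cognos builds its SQL. The leave-days value is hypothetical (the slides do not show Employee Leave facts), and the function name `drill_across` is an assumption.

```python
# Payroll Charge facts already summarized to (employee, month/year, amount).
payroll = [("Suzie", "8/2012", 934), ("Bob", "8/2012", 750)]
# Employee Leave facts: (employee, month/year, leave type, leave days).
# The number of leave days is hypothetical.
leave = [("Suzie", "8/2012", "Sabbatical", 20)]


def drill_across(payroll, leave):
    """Aggregate each star to the conformed keys, then merge on those keys."""
    pay_totals = {}
    for emp, month, amount in payroll:
        pay_totals[(emp, month)] = pay_totals.get((emp, month), 0) + amount
    leave_totals = {}
    for emp, month, _leave_type, days in leave:
        leave_totals[(emp, month)] = leave_totals.get((emp, month), 0) + days
    keys = sorted(set(pay_totals) | set(leave_totals))
    return {key: (pay_totals.get(key), leave_totals.get(key)) for key in keys}


combined = drill_across(payroll, leave)
print(combined)
```

The merge works only because Employee and Date mean the same thing in both stars; that shared meaning is what makes them conformed.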

56

Key Concept #7: Understand the Dimensionality of Facts

• When doing cross-star queries, a fact cannot be aggregated on attribute values for a dimension it does not have.

• If you try to do this, Cognos will simply replicate the values for the facts aggregated on the dimensions it does have.

57

Dimensionality of Facts

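[Figure: the Payroll Charge Fact Table and the Employee Leave fact table sharing the Employee and Date dimensions; Leave Type is attached only to Employee Leave, so Payroll Amount has no Leave Type dimension]

Depart   Date    Leave       Emp. Name  Payroll Amount
4090023  8/2012  Sabbatical  Suzie      $934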

58

Dimensionality of Facts

Depart   Date    Leave       Emp. Name  Payroll Amount
4090023  8/2012  Sabbatical  Suzie      $934

59

Dimensionality of Facts

Depart   Date    Leave       Emp. Name  Payroll Amount
4090023  8/2012  Sabbatical  Suzie      $934
4090023  8/2012  Sick        Suzie      $934
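The two-row result above shows the replication: Payroll Amount has no Leave Type dimension, so the same $934 is repeated for each leave type rather than split between them. The sketch below reproduces that effect with assumed data structures to show why summing the replicated column is misleading.

```python
# Payroll Amount is known only by (employee, month/year); it has no leave type.
payroll_by_emp_month = {("Suzie", "8/2012"): 934}
# Employee Leave rows for the same employee and month: (employee, month/year, leave type).
leave_rows = [("Suzie", "8/2012", "Sabbatical"), ("Suzie", "8/2012", "Sick")]

# Displaying Leave Type next to Payroll Amount repeats the amount on every leave row.
report = [
    (emp, month, leave_type, payroll_by_emp_month[(emp, month)])
    for emp, month, leave_type in leave_rows
]

replicated_sum = sum(amount for *_rest, amount in report)  # counts $934 twice
actual_total = sum(payroll_by_emp_month.values())          # the real payroll

print(report)
print(replicated_sum, actual_total)
```

Suzie was paid $934 in August, not $1,868; the larger figure appears only if the replicated values are added together.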

60

Key Concept #8: Understand Subset Conformed Dimensions

• Two dimensions can be treated as a single conformed dimension provided one dimension is a complete subset of the other

• For example, the Month Year dimension is a complete subset of the Date Dimension

• If one star has the full dimension and another star has the subset dimension, you can do a cross-star query

• But, the grain of the query can be no more detailed than the grain of the subset dimension

61

Subset Dimensions

[Figure: the Payroll Charge Fact Table linked to the Date dimension and the Compensation Plan Fact Table linked to the Month Year dimension]

Cross-star query possible, but the date grain can be no more detailed than monthly totals.
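In practice this means the daily facts have to be rolled up to the month before they can be compared with facts that exist only at month grain. The sketch below assumes hypothetical Compensation Plan amounts (none are shown in the slides) and uses two of Bob's payroll rows; the function name `roll_up_to_month` is also an assumption.

```python
from datetime import date

# Payroll Charge facts at daily grain: (date, amount). Two of Bob's rows.
payroll = [(date(2012, 7, 30), 750), (date(2012, 8, 2), 750)]
# Compensation Plan facts at month grain: (year, month) -> planned amount.
# These planned amounts are hypothetical.
comp_plan = {(2012, 7): 800, (2012, 8): 800}


def roll_up_to_month(daily):
    """Aggregate daily facts to the (year, month) grain of the subset dimension."""
    monthly = {}
    for day, amount in daily:
        key = (day.year, day.month)
        monthly[key] = monthly.get(key, 0) + amount
    return monthly


actual = roll_up_to_month(payroll)
comparison = {month: (actual.get(month, 0), planned)
              for month, planned in comp_plan.items()}
print(comparison)
```

The comparison cannot go below the month level because the Compensation Plan side has no daily values to line up against.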

62

Metadata

• Metadata = Data about data

For Example:

• Field Type (number, date, character, etc.)

• Length

• Definition

• Appropriate Usage

• Source of the Data

• There is a complete Metadata Repository for the data marts.

• It is a star that can be queried using Cognos.

• It will be covered in part 2 of the data training.

63

Security

• The data marts have a comprehensive security model

• There are three areas that are secured:

1. Cognos tool access (licenses)

2. Cognos content

3. Data Access

64

Cognos Tool Access

[Figure: the Cognos tools (Report Studio, Query Studio, Business Insight Advanced, Business Insight, Report Viewer) arranged by license type and user type]

License Type              User Type
Professional              Super User
Advanced Business Author  Power User
Enhanced Consumer         End User

65

Cognos Content Security

Public Folders > Boiler Insight

• Standard Content: Location for “approved” content. Read-only for all; write access for select few; access controlled centrally.

• Packages: Location for all packages. Read-only for all; write access for IT only; access controlled centrally.

• Shared Content: Can be used for sharing BI content with other users. Write access for all; access controlled centrally.

• Departmental Content: Private location for an area to use for their own content. Write access for that area; access controlled by that area.

67

Data Access Security

Data access can be secured at any of these levels:

• Entire Data Mart

• Entire Star Schema

• Entire Dimension

• Specific Fields (Attributes or Facts)

• Specific Records

Moving down this list, security becomes more difficult to implement and negatively impacts performance!

68

Data Access Security

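[Figure: security levels from Level1 (least restrictive) to Level3 (most restrictive) applied to the HR Payroll Charge Fact Table and its Date, Cost Center, and Employee dimensions, with the Employee attributes Age, Gender, Ethnicity, and DOB called out]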

69

Data Access Security

• We ended up with three subject areas:

1. HR

2. OIE

3. IAMO

70

Data Access Security

[Figure: the HR, OIE, and IAMO subject areas set against Level1 (least restrictive) through Level3 (most restrictive), placing the Employee attributes Age, Ethnicity, DOB, Disability, Career Account, Email Address, Gender, Highest Degree, PUID, Visa, I9 Info, Minority, Race, and Veteran]

71

Summary

You now know:

• General structure of dimensionally modeled data: data marts, stars, dimensions, facts, etc.

• What dimensionally modeled data looks like in Cognos

• The three ways to access data from a data mart:

• Single Dimension

• Star

• Multiple stars

• How history is handled and the four types of slowly changing dimensions: Types 1, 2, 3, 6

72

Summary (cont.)

You now know:

• The eight “Key Concepts” of dimensionally modeled data:

1. Understand the basic building blocks

2. Understand which access method you are using

3. Understand auto-aggregation

4. Understand semi-additive facts

5. Use the correct dates

6. Understand conformed dimensions

7. Understand the dimensionality of facts

8. Understand subset conformed dimensions

73

Summary (cont.)

You now know:

• What metadata is

• The three areas of security:

1. Tool Access

2. Cognos Content Access

3. Data Access

74

Questions?

75