Date post: 19-Apr-2018
Data Warehouse Enablement Through Bi-Level Data Modeling Charles W. Bachman
Page 1: Data Warehouse Enablement Through Bi-Level Data … · Decision Support Premise That there is valuable information locked up in the current operational systems and other sources which,

Data Warehouse Enablement Through

Bi-Level Data Modeling

Charles W. Bachman

Overview:

• decision support premise

• decision support problem

• data warehouse solution

• data complexity problem

• “bi-level” data model solution

Decision Support Premise

That there is valuable information locked up in the current operational systems and other sources which, if it were readily available, would contribute to improved business decision making and lead to a more efficient and profitable business.

Overview:

• decision support premise

• decision support problem

• data warehouse solution

• data complexity problem

• “bi-level” data model solution

Decision Support Problem

• many, independently developed operational systems

• inadequate capacity and performance

• inappropriate data organization

• conflicting objectives

• insufficient data content

Integrated Database Solution

• single, integrated database

• unchallenged capacity and performance

• multiple, concurrent data organizations

• unified objectives

• integrated, current, archival and external data

Integrated Database Solution

single, integrated, all-purpose database

Integrated Database Solution

• idealized

• impractical

• too risky

• not in our time

• maybe never

Overview:

• decision support premise

• decision support problem

• data warehouse solution

• data complexity problem

• “bi-level” data model solution

Data Warehouse Solution

[Diagram labels: operational systems, data warehouse, DSS users]

Data Warehouse Solution

• pragmatic

• practical

• minimize risk

• compromising

• potentially, very expensive

Data Warehouse Solution (stage 0)

[Diagram labels: operational systems, data warehouse, DSS users]

Data Warehouse Solution (stage 1)

data warehouse

Data Warehouse Solution (stage 2)

data warehouse

Data Warehouse Solution (stage 3)

data warehouse

Data Warehouse Solution (stage 4)

[Diagram labels: data mart, data warehouse]

Data Warehouse Solution (stage 4)

The “data warehouse” in the stage 4 solution provides the one, consistent, integrated database that was sought by the “integrated database” solution.

All data marts are constructed from the information obtained from that single data warehouse and thus yield consistent answers to all users.

Operational Systems

repetitive decisions

limited data

response intensive

frequent update

operational worker

Decision Support Systems

one of a kind decisions

data intensive

relaxed response

read only

knowledge worker

OLTP versus OLAP

The term “Online Transaction Processing” (OLTP) is used to characterize operational systems.

The term “Online Analytical Processing” (OLAP) is used to characterize the newer decision support systems.

But look out for “data mining.”

OLAP/ROLAP/MOLAP/DOLAP

• OLAP

• Relational - OLAP

• Multidimensional - OLAP

• Distributed - OLAP

Data Mining (off-line analytical processing)

Data mining is executed by batch programs (demons) which, acting on their own, intelligently survey masses of data looking for significant patterns that are obscured by the sheer mass of the available data.

Overview:

• decision support premise

• decision support problem

• data warehouse solution

• data complexity problem

• “bi-level” data model solution

All Kinds of Data

• base and summary data

• current and archival data

• internal and external data

• content and context data

• real and meta data

Base and Summary Data

• last sales transaction

• individual sales transactions for the last 24 hours

• summary of sales by store, product and hour of day, for the last 24 hours

• summary of daily sales, for the last thirteen weeks

• summary of monthly sales, by district, for the last ten years
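
The step from base data to summary data can be sketched in a few lines of Python. This is only an illustration: the transaction tuples, the store and product names, and the function name are assumptions, not part of the original slides.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical base data: individual sales transactions as
# (store, product, time of sale, amount). All values are invented.
transactions = [
    ("store-1", "widget", datetime(1997, 3, 1, 9, 15), 20.0),
    ("store-1", "widget", datetime(1997, 3, 1, 9, 40), 30.0),
    ("store-1", "gadget", datetime(1997, 3, 1, 14, 5), 12.5),
    ("store-2", "widget", datetime(1997, 3, 1, 9, 55), 8.0),
]


def summarize_by_store_product_hour(rows):
    """Roll base transactions up into totals keyed by (store, product, hour of day)."""
    totals = defaultdict(float)
    for store, product, sold_at, amount in rows:
        totals[(store, product, sold_at.hour)] += amount
    return dict(totals)


summary = summarize_by_store_product_hour(transactions)
```

The same base rows could feed coarser summaries (daily, monthly, by district) by changing the grouping key.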

Current and Archival Data

• current sales, current inventory, current bank balances, current salaries, etc.

• sales transactions for the past ten years

• personnel records as they have appeared, change by change, since employment

• claims history by insurance policy, since the policy was written

Internal and External Data

Internal Data

• company’s truck sales

• company’s automobile sales

External Data

• industry truck sales

• industry automobile sales

Content and Contextual Data

• changed sales district boundaries

• changed product grouping

• changed accounting rules

• changed pricing structure

• changed product packaging

• changed fiscal year

Question: How are the numbers to be understood?

Seventy Five (75) Units

• kilograms?

• gallons?

• feet?

• cubic feet?

• cartons?

• metric tons?

• cars?

Why were the Chicago district sales so low in 1996?

[Chart: sales ($) for the years 92 through 96, with a trend line and a question mark]

Real and Meta Data

“meta”   order no   date     quantity

“real”   12498      May 1    10
         12944      May 7    21
         13001      June 6   75
         13749      July 9   15
         14992      Aug 9    23
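
The distinction can be illustrated with a small Python sketch in which the column descriptions are the meta data and the order rows are the real data. The row values come from the slide; the class name, the column types, and the helper functions are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnDescription:
    """One piece of meta data: what a column is called and what it holds."""
    name: str
    kind: type  # assumed Python type, not stated on the slide


# "Meta" data: describes the shape of every order row.
order_meta = (
    ColumnDescription("order no", int),
    ColumnDescription("date", str),
    ColumnDescription("quantity", int),
)

# "Real" data: the order rows shown on the slide.
order_rows = [
    (12498, "May 1", 10),
    (12944, "May 7", 21),
    (13001, "June 6", 75),
    (13749, "July 9", 15),
    (14992, "Aug 9", 23),
]


def describe(row, meta):
    """Pair each real value with the name its meta data gives it."""
    return {column.name: value for column, value in zip(meta, row)}


def conforms(row, meta):
    """Check that a row has the number and kinds of values the meta data describes."""
    return len(row) == len(meta) and all(
        isinstance(value, column.kind) for value, column in zip(row, meta)
    )
```

Without the meta data, the row `(13001, "June 6", 75)` is just three values; with it, 75 becomes a quantity on order 13001.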

Data and Data Model

A data model describes the data in a database.

database data

data model

Data Models, with Their Data Descriptions, Control Access to Databases.

database

data model

Operational Databases, with Their Data Models

[Diagram: four operational databases, each labeled with its own data model]

Data Transformations and Data Model Maps

[Diagram labels: operational systems, data warehouse, DSS users; three data models; two databases]

Single Level Data Models and Their Data Model Maps

[Diagram labels: operational systems, data warehouse, DSS users]

m x (m - 1) Mappings

If there are “m” data models, then there are potentially m x (m - 1) mappings to create and maintain.
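
The count follows from treating each mapping as an ordered (source, target) pair of distinct data models. A minimal Python sketch, with invented model names, enumerates the pairs:

```python
from itertools import permutations


def direct_mappings(models):
    """Every ordered (source, target) pair of distinct data models."""
    return list(permutations(models, 2))


# Hypothetical data model names, used only to make the count concrete.
models = ["orders", "billing", "inventory", "warehouse"]
pairs = direct_mappings(models)
mapping_count = len(pairs)  # m x (m - 1) with m = 4
```

With four data models this already yields twelve mappings, and each added model adds two mappings for every model that is already there.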

Overview:

• decision support premise

• decision support problem

• data warehouse solution

• data complexity problem

• “bi-level” data model solution ✔

Bi-Level Data Models

[Diagram labels: operational systems, data warehouse, DSS users; a logical level holding the logical data model; a physical level]

“m” Mappings

With Bi-Level Data Modeling, if there are “m” physical data models, there are only the m mappings, one between each physical data model and the single logical data model.
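
A rough Python sketch of the idea: each physical data model carries one mapping to the logical data model, and a record moves between any two physical models by passing through the logical level. The model names, field names, and functions below are hypothetical.

```python
# One mapping per physical data model: physical field name -> logical attribute.
# All model and field names are invented for illustration.
TO_LOGICAL = {
    "cobol_orders": {"ORD-NO": "order_number", "ORD-QTY": "quantity"},
    "sql_orders": {"order_id": "order_number", "qty": "quantity"},
    "mart_orders": {"OrderKey": "order_number", "Units": "quantity"},
}


def to_logical(model, record):
    """Rename a physical record's fields to logical attribute names."""
    mapping = TO_LOGICAL[model]
    return {mapping[field]: value for field, value in record.items()}


def from_logical(model, logical_record):
    """Rename logical attributes back to one physical model's field names."""
    reverse = {logical: physical for physical, logical in TO_LOGICAL[model].items()}
    return {reverse[attribute]: value for attribute, value in logical_record.items()}


def translate(source, target, record):
    """Move a record between two physical models via the logical model."""
    return from_logical(target, to_logical(source, record))


converted = translate("cobol_orders", "mart_orders", {"ORD-NO": 13001, "ORD-QTY": 75})
```

Three physical models need three mapping tables here, yet any of the six source-to-target translations is available.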

DBA’s Challenge

The complexity of the data warehouse solutions, with so many operational systems, external sources, data marts and a single integrating data warehouse, creates a real challenge for the database administrators.

Meta Data Focus

Just keeping track of the various data descriptions used by the various files, databases, programs, and the mappings that join them, is a challenge.

Bi-level data models must handle a number of programming languages, query languages, and database management systems:

• COBOL

• PL/1

• IMS

• IDMS

• many varieties of SQL

• Object Oriented databases

• Multidimensional databases

• and whatever might come next

Data Warehouse (stage 4)

[Diagram labels: operational systems, data warehouse, data mart, DSS users]

Data Warehouse Solution (stage 1)

[Diagram labels: operational systems, data mart, DSS users; format 1, format 2, format 3]

The “bi-level data model” approach operates on the well known principle of “abstraction.”

It uses a single, higher level logical data model (conceptual schema) to understand the semantics of differing physical level data models.

Levels of Abstraction

Logical Data Model

Physical Data Model

Machine Model

Micro Code Model

Physical Model

Meta Data Challenge

While the single, integrated database solution to operational and decision support systems seems continually to be beyond reach, the single, integrated meta data model to support the data warehouse solution is available today.

Logical Data Model

• key to bi-level data modeling

• provides platform independence

• target for reverse engineering existing databases and files

• source for forward engineering to new databases

• synchronizes logical and physical descriptions
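
Forward engineering from a logical data model can be sketched as generating a physical table definition from a platform-independent entity description. The entity, its attribute kinds, and the type map below are assumptions for illustration; only the SQLite calls are real library API.

```python
import sqlite3

# Hypothetical logical entity: names and abstract kinds, no platform detail.
LOGICAL_ORDER = {
    "entity": "customer_order",
    "attributes": [
        ("order_number", "integer"),
        ("order_date", "text"),
        ("quantity", "integer"),
    ],
}

# Assumed mapping from logical kinds to one target platform's column types.
SQLITE_TYPES = {"integer": "INTEGER", "text": "TEXT"}


def forward_engineer(entity, type_map):
    """Generate a CREATE TABLE statement for one target platform."""
    columns = ", ".join(
        f"{name} {type_map[kind]}" for name, kind in entity["attributes"]
    )
    return f"CREATE TABLE {entity['entity']} ({columns})"


ddl = forward_engineer(LOGICAL_ORDER, SQLITE_TYPES)

# Apply the generated definition to an in-memory database and read it back.
conn = sqlite3.connect(":memory:")
conn.execute(ddl)
columns = [row[1] for row in conn.execute("PRAGMA table_info(customer_order)")]
```

Targeting a different platform would mean supplying a different type map while the logical description stays unchanged, which is the platform independence the slide refers to.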

Physical Models for Data Warehousing

• spread sheets

• multidimensional schema

• star schema

• snowflake schema

Spread Sheet

1 - fact
2 - dimensions

[Diagram: a SHEET with rows 1 to 4 and columns a to d; each CELL is located by its ROW and COLUMN]

Spread Sheet

1 - fact
2 - dimensions

SHEET

CUSTOMER

ORDER

PRODUCT

Cube

1 - fact
3 - dimensions

[Diagram: 3-D CUBE; legible labels: CUSTOMER, ORDER, PRODUCT, LOCATION, CUSTOMER SHIPMENT, STOCK ITEM]

Multidimensional Cube

1 - fact
4 - dimensions, or more

[Diagram: MULTI-D CUBE; legible labels: CUSTOMER, ORDER, PRODUCT, LOCATION, TIME PERIOD, CUSTOMER SHIPMENT, STOCK ITEM, RESTOCK ITEM]

Summary Data in the Cube

1 - set of summarized facts
4 - dimensions

[Diagram: MULTI-D CUBE; legible labels: DAILY TOTAL, MONTHLY TOTAL, YEARLY TOTAL, CUSTOMER, PRODUCT, LOCATION, TIME PERIOD, CUSTOMER SHIPMENT, STOCK ITEM, RESTOCK ITEM]

Star Schema

1 - fact
4 - dimensions

product

location

time period

order

customer
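
A star schema of this shape can be sketched with Python's built-in `sqlite3` module. The sketch assumes that order is the fact and that product, location, time period, and customer are the four dimensions; the table names, column names, and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Four dimension tables (assumed columns).
CREATE TABLE product     (product_id  INTEGER PRIMARY KEY, name  TEXT);
CREATE TABLE location    (location_id INTEGER PRIMARY KEY, name  TEXT);
CREATE TABLE time_period (period_id   INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE customer    (customer_id INTEGER PRIMARY KEY, name  TEXT);

-- One fact table at the center of the star, one key per dimension.
CREATE TABLE order_fact (
    product_id  INTEGER REFERENCES product(product_id),
    location_id INTEGER REFERENCES location(location_id),
    period_id   INTEGER REFERENCES time_period(period_id),
    customer_id INTEGER REFERENCES customer(customer_id),
    quantity    INTEGER
);

INSERT INTO product     VALUES (1, 'widget'), (2, 'gadget');
INSERT INTO location    VALUES (1, 'Chicago');
INSERT INTO time_period VALUES (1, '1996');
INSERT INTO customer    VALUES (1, 'Acme');
INSERT INTO order_fact  VALUES (1, 1, 1, 1, 10), (1, 1, 1, 1, 21), (2, 1, 1, 1, 75);
""")

# A typical question about the one fact: total quantity ordered per product.
rows = conn.execute("""
    SELECT p.name, SUM(f.quantity)
    FROM order_fact AS f
    JOIN product AS p ON p.product_id = f.product_id
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()
```

Every query joins the fact table directly to whichever dimensions it needs, which is what keeps questions about this one kind of fact simple.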

Snow Flake Schema

product

location

district

store type

time period

order

customer

salesman

age group

Snow Flake Schema

product

location

district

store type

day of week

time of day

time period

order

customer

salesman

age group
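
In a snowflake schema the dimensions themselves are broken out into further tables. The sketch below assumes that district and store type hang off the location dimension, which the flattened slide text suggests but does not confirm; table names, columns, and rows are invented, and the other dimensions are omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Outer tables that the location dimension refers to (assumed structure).
CREATE TABLE district   (district_id   INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE store_type (store_type_id INTEGER PRIMARY KEY, name TEXT);

-- The location dimension now carries keys instead of repeated text.
CREATE TABLE location (
    location_id   INTEGER PRIMARY KEY,
    name          TEXT,
    district_id   INTEGER REFERENCES district(district_id),
    store_type_id INTEGER REFERENCES store_type(store_type_id)
);

-- Fact table, reduced to a single dimension key for this sketch.
CREATE TABLE order_fact (
    location_id INTEGER REFERENCES location(location_id),
    quantity    INTEGER
);

INSERT INTO district   VALUES (1, 'Chicago'), (2, 'Boston');
INSERT INTO store_type VALUES (1, 'outlet');
INSERT INTO location   VALUES (1, 'store A', 1, 1), (2, 'store B', 1, 1), (3, 'store C', 2, 1);
INSERT INTO order_fact VALUES (1, 10), (2, 21), (3, 75);
""")

# Totals by district need two joins: fact -> location -> district.
by_district = conn.execute("""
    SELECT d.name, SUM(f.quantity)
    FROM order_fact AS f
    JOIN location AS l ON l.location_id = f.location_id
    JOIN district AS d ON d.district_id = l.district_id
    GROUP BY d.name
    ORDER BY d.name
""").fetchall()
```

The extra join is the cost of the snowflake layout; the benefit is that each district or store type is described once.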

Snow Flake Schema

[Diagram: MULTI-D CUBES; legible labels: CUSTOMER, PRODUCT, LOCATION, TIME PERIOD, TIME OF DAY, DAY OF WEEK, DISTRICT, STORE TYPE, PACKAGE, SOURCE, SALESPERSON, AGE GROUP, CUSTOMER SHIPMENT, STOCK ITEM, STOCK ITEM SUMMARY, RESTOCK ITEM]

Multiple Multi-Cube Databases

orders purchases

production collections

The OLAP/MOLAP/ROLAP/DOLAP view of the business world is a narrow view, designed and organized to answer questions about a specified type of fact, quickly and easily. It offers a rifle shot into the designated collection of corporate data.

Multi-disciplinary questions crossing from one star schema or multidimensional cube to another generally require separate queries by the person raising the question.

Summary: Bi-Level Data Model Architecture

• records the flow of data,

• defines the necessary data transformations,

• assures data consistency, and

• catalogues the available data for decision support workers.
