+ All Categories
Home > Documents > Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

Date post: 25-Dec-2015
Category:
Upload: brent-potter
View: 213 times
Download: 0 times
Share this document with a friend
Popular Tags:
31
Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor
Transcript
Page 1: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

Week 6 Lecture

The Data Warehouse

Samuel Conn, Asst. Professor

Page 2: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

2

In this lecture, you will learn: How operational data and decision

support differ What a data warehouse is and how its

data are prepared What star schemas are and how they

are constructed What steps are required to implement a

data warehouse successfully What data mining is and what role it

plays in decision support

Page 3: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

3

External and internal forces require tactical and strategic decisions

Search for competitive advantage Business environments are dynamic Decision-making cycle time is reduced Different managers require different

decision support systems (DSS)

The Need for Data Analysis

Page 4: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

4

Decision Support Is a methodology Extracts information from data Uses information as basis for decision

making

Decision Support Systems

Page 5: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

5

Decision Support Systems Decision support system (DSS)

Arrangement of computerized tools Used to assist managerial decision Extensive data “massaging” to produce

information Used at all levels in organization Tailored to focus on specific areas and needs Interactive Provides ad hoc query tools

Page 6: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

6

DSS Components

Figure 13.1

Page 7: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

7

Operational data Relational, normalized database Optimized to support transactions Real time updates

DSS Snapshot of operational data Summarized Large amounts of data

Data analyst viewpoint Timespan Granularity Dimensionality

Operational vs. Decision Support Data

Page 8: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

8

Database schema Support complex (non-normalized) data Extract multidimensional time slices

Data extraction and filtering End-user analytical interface Database size

Very large databases (VLDBs) Contains redundant and duplicated data

The DSS Database Requirements

Page 9: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

9

Integrated Centralized Holds data retrieved from entire organization

Subject-Oriented Optimized to give answers to diverse questions Used by all functional areas

Time Variant Flow of data through time Projected data

Non-Volatile Data never removed Always growing

Data Warehouse

Page 10: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

10

Creating a Data Warehouse

Figure 13.3

Page 11: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

11

Single-subject data warehouse subset

Decision support to small group Can be test for exploring potential

benefits of Data warehouses Address local or departmental

problems

Data Marts

Page 12: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

12

DSS Architectural Styles

Traditional mainframe-based OLTP Managerial information system

(MIS) with 3GL First-generation departmental DSS First-generation enterprise data

warehouse using RDMS Second-generation data

warehouse using MDBMS

Page 13: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

13

Advanced data analysis environment Supports decision making, business

modeling, and operations research activities

Characteristics of OLAP Use multidimensional data analysis

techniques Provide advanced database support Provide easy-to-use end-user interfaces Support client/server architecture

Online Analytical Processing (OLAP)

Page 14: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

14

OLAP Client/Server Architecture

Figure 13.6

Page 15: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

15

OLAP Server Arrangement

Figure 13.7

Page 16: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

16

OLAP Server with Multidimensional Data Store Arrangement

Figure 13.8

Page 17: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

17

OLAP Server with Local Mini-Data-Marts

Figure 13.9

Page 18: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

18

OLAP functionality Uses relational DB query tools Extensions to RDBMS

Multidimensional data schema support Data access language and query

performance optimized for multidimensional data

Support for very large databases (VLDBs)

Relational OLAP (ROLAP)

Page 19: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

19

Typical ROLAP Client/Server Architecture

Figure 13.10

Page 20: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

20

OLAP functionality to multidimensional databases (MDBMS)

Stored data in multidimensional data cube

N-dimensional cubes called hypercubes

Cube cache memory speeds processing

Affected by how the database system handles density of data cube called sparsity

Multidimensional OLAP (MOLAP)

Page 21: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

21

MOLAP Client/Server Architecture

Figure 13.11

Page 22: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

22

Data-modeling technique Maps multidimensional decision support into

relational database Yield model for multidimensional data

analysis while preserving relational structure of operational DB

Four Components: Facts Dimensions Attributes Attribute hierarchies

Star Schema

Page 23: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

23

Simple Star Schema

Figure 13.12

Page 24: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

24

Slice and Dice View of Sales

Figure 13.14

Page 25: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

25

Facts and dimensions represented by physical tables in data warehouse DB Fact table related to each dimension table (M:1) Fact and dimension tables related by foreign keys Subject to the primary/foreign key constraints

Star Schema Representation

Page 26: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

26

Star Schema for Sales

Figure 13.17

Page 27: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

27

Normalization of dimensional tables Multiple fact tables representing different aggregation levels

Denormalization of the fact tables Table partitioning and replication

Performance-Improving Techniques for Star Schema

Page 28: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

28

Data Warehouse Implementation Road Map

Figure 13.21

Page 29: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

29

Seeks to discover unknown data characteristics

Automatically searches data for anomalies and relationships

Data mining tools Analyze data Uncover problems or opportunities Form computer models based on findings Predict business behavior with models Require minimal end-user intervention

Data Mining

Page 30: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

30

Extraction of Knowledge from Data

Figure 13.22

Page 31: Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor.

31

Data Mining Process

Figure 13.23


Recommended