STRATEGIC INFORMATION SYSTEMS IV STV401T / B BTIP05...

Date post: 27-Jul-2020
Page 1: STRATEGIC INFORMATION SYSTEMS IV STV401T / B BTIP05 ...lavhengwaonline.weebly.com/uploads/8/4/7/8/84787622/dr_tendani... · 2/3/2018  · LECTURE: 05 - DATA WAREHOUSING (DW) -A database

Corporate Affairs and Marketing (CA&M) Brand and Event Management

STRATEGIC INFORMATION SYSTEMS IV

STV401T / B

BTIP05 / BTIX05 - BTECH

DEPARTMENT OF INFORMATICS

LECTURE: 05 (A)

DATA WAREHOUSING (DW)

By: Dr. Tendani J. Lavhengwa

[email protected]

Inspirational Quotes

• My personal quote:

“Always be a thought ahead. Do not fear the blank page, everything started somewhere”

• Quotes to consider as inspiration:

"Errors using inadequate data are much less than those using no data at all" ~ Charles Babbage

“One is too small a number to achieve greatness” ~ John C. Maxwell

• Your quotes?

???

LECTURE: 05 - DATA WAREHOUSING (DW)

Start-up items to discuss

1. Literature in context and evolution
2. Data Warehousing (DW) - multiple definitions
3. Data Warehousing concepts
4. Data Warehousing fundamental characteristics
5. Key points from business on Data warehouses (IBM, 2018)
6. Traditional integration, Data Warehouse vs. Operational DBMS
7. OLTP systems vs. Data Warehouse
8. Application-Orientation vs. Subject-Orientation
9. Data Warehouse Models
10. Modelling of Data Warehouse – dimensions and measures
11. Organising data for Data Warehouses
12. Data Mart Centric
13. Data Warehouse Architecture - Base
14. Extract, Transform and Load (ETL) process

1. Literature in context and evolution

Data warehouse architecture

• Born in the 1980s as an architectural model designed to support the flow of data from operational systems to decision support systems.

1988

• Data warehousing saw its genesis in the late 1980s. An IBM Systems Journal article published in 1988, "An architecture for a business and information system", coined the term “business data warehouse”, although Bill Inmon, who later became a leading figure in the practice, had used a similar term in the 1970s.

Later, in the 1990s

• Inmon developed the concept of the Corporate Information Factory, an enterprise-level view of an organisation's data in which data warehousing plays one part.

Russom (2015)

• Data warehouse architecture is influenced by business practices and goals that continue to evolve, because a well-aligned data warehouse reflects the business it serves.

2. Data Warehousing (DW) - multiple definitions

Turban et al. (2011)

• A pool of data produced to support decision making

• A repository of current and historical data of potential interest to managers throughout the organisation

Bocij et al. (2015)

• Large database systems containing current and historical data that can be analysed to produce information to support organisational decision making

IBM.com (2018)

• Data warehouse databases provide a decision support system (DSS) environment in which you can evaluate the performance of an entire enterprise over time

William H. Inmon

• A subject-oriented, integrated, time-variant and non-volatile collection of data that supports management's decision-making process

Others

• A collection of corporate information and data derived from operational systems and external data sources

• Designed to support business decisions by allowing data consolidation, analysis and reporting at different aggregate levels

• A federated repository for all the data that an enterprise's various business systems collect

• The data warehouse repository may be physical or logical

3. Data Warehousing concepts

Data warehouses are aimed at decision making.

Data is populated into the DW through the processes of extraction, transformation and loading (ETL).

Data warehouse databases are optimized for data retrieval.

[Figure: extraction, transformation and loading (ETL)]

4. Data Warehousing fundamental characteristics (1 of 2)

Subject-oriented

• Data is organised by detailed subject and contains only information relevant for decision support

• E.g. sales, products, customers

Integrated

• Closely related to subject orientation

• The DW must place data from different sources into a consistent format

• The data is presumed to be totally integrated

Time-variant

• Maintains historical data

• The data does not necessarily provide current status (except in real-time systems)

• The historical data is used to detect trends, deviations and long-term relationships for forecasting and comparisons, leading to decision making

Non-volatile

• Once data is in the DW, users cannot change or update it

• Obsolete data is discarded and changes are recorded as new data
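
The time-variant and non-volatile characteristics can be illustrated with a small append-only table. This is only a minimal sketch: the `customer_history` table, the `record_change` and `city_as_of` helpers and the sample rows are all hypothetical and do not come from the lecture. Python's built-in `sqlite3` module is used purely for convenience.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer_history (      -- hypothetical table name
        customer_id INTEGER NOT NULL,
        city        TEXT    NOT NULL,
        valid_from  TEXT    NOT NULL     -- ISO date from which the value applies
    )
    """
)

def record_change(customer_id, city, valid_from):
    """Record a change as a NEW row; existing rows are never updated."""
    conn.execute(
        "INSERT INTO customer_history VALUES (?, ?, ?)",
        (customer_id, city, valid_from),
    )

def city_as_of(customer_id, date):
    """Return the city that applied to the customer on the given ISO date."""
    row = conn.execute(
        """
        SELECT city FROM customer_history
        WHERE customer_id = ? AND valid_from <= ?
        ORDER BY valid_from DESC
        LIMIT 1
        """,
        (customer_id, date),
    ).fetchone()
    return row[0] if row else None

record_change(1, "Pretoria", "2017-01-01")
record_change(1, "Polokwane", "2018-02-01")  # a move is appended, not overwritten

history_rows = conn.execute("SELECT COUNT(*) FROM customer_history").fetchone()[0]
city_2017 = city_as_of(1, "2017-06-30")
city_2018 = city_as_of(1, "2018-03-01")
```

Because no row is ever updated in place, both the old and the new city remain queryable, which is what makes trend analysis and period-to-period comparison possible.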

4. Data Warehousing fundamental characteristics (2 of 2)

Additional characteristics

• Web-based

• Relational / multidimensional

• Client/server

• Real-time

• Includes metadata

5. Key points from business on Data warehouses (IBM, 2018)

-A database that is optimized for data retrieval to facilitate reporting and analysis.

-A data warehouse incorporates information about many subject areas, often the entire enterprise.

-Typically you use a dimensional data model to design a data warehouse.

-The data is organized into dimension tables and fact tables using star and snowflake schemas.

-The data is denormalized to improve query performance.

6. Traditional integration, Data Warehouse vs. Operational DBMS

Data Warehouse vs. Operational DBMS

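Traditional heterogeneous DB integration: a query-driven approach

• A query is translated and sent to the individual source databases each time it is asked, and the results are integrated at query time

Data warehouse: an update-driven, high-performance approach

• Data from the sources is integrated in advance and stored in the warehouse, ready for direct querying and analysis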
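
The contrast between the query-driven and update-driven approaches can be sketched in a few lines. Everything here is an assumption made for illustration: the two in-memory "sources", their record layouts, the figures and the function names are hypothetical, and a real system would sit on top of actual databases.

```python
# Two hypothetical operational sources holding the same kind of fact
# in different shapes (illustrative data only).
SOURCE_A = [{"cust": "C1", "amount_cents": 12050}]
SOURCE_B = [("C1", "99.50"), ("C2", "10.00")]

def from_a(rec):
    """Translate a SOURCE_A record into the common format."""
    return {"customer": rec["cust"], "amount": rec["amount_cents"] / 100}

def from_b(rec):
    """Translate a SOURCE_B record into the common format."""
    return {"customer": rec[0], "amount": float(rec[1])}

def integrated_rows():
    """Read every source and convert each record to the common format."""
    return [from_a(r) for r in SOURCE_A] + [from_b(r) for r in SOURCE_B]

def query_driven_total(customer):
    """Query-driven: sources are read and integrated again on every query."""
    return sum(r["amount"] for r in integrated_rows() if r["customer"] == customer)

def build_warehouse():
    """Update-driven: integrate once, ahead of time, and store the result."""
    warehouse = {}
    for row in integrated_rows():
        warehouse[row["customer"]] = warehouse.get(row["customer"], 0.0) + row["amount"]
    return warehouse

WAREHOUSE = build_warehouse()

def update_driven_total(customer):
    """Answer from the pre-integrated store without touching the sources."""
    return WAREHOUSE.get(customer, 0.0)
```

Both functions return the same answer for a given customer. The difference is where the integration cost is paid: on every query in the first case, and once per load in the second, at the price of the warehouse only being as fresh as its last load.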

7. OLTP systems vs. Data Warehouse

8. Application-Orientation vs. Subject-Orientation

9. Data Warehouse Models

Enterprise Warehouse

• Collects all the information about subjects spanning the entire organisation/enterprise

Data Mart

• A subset of corporate-wide data that is of value to a specific group of users

• Example: marketing, sales or finance data marts

Virtual Warehouse

• A set of views over operational databases

• Only some of the possible summary views may be materialised
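
A virtual warehouse can be illustrated as a summary view defined directly over an operational table. This is a sketch under stated assumptions: the `orders` table, the `sales_by_region` view and the data are hypothetical. SQLite (via Python's built-in `sqlite3`) has no materialised views, so a snapshot table created from the view stands in for "materialising" it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical operational table (illustrative data only)
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL);
    INSERT INTO orders (region, amount) VALUES
        ('North', 100.0), ('North', 50.0), ('South', 70.0);

    -- Virtual warehouse: a summary view over the operational table
    CREATE VIEW sales_by_region AS
        SELECT region, SUM(amount) AS total_amount
        FROM orders
        GROUP BY region;
    """
)

def region_totals():
    """Read the summary view into a {region: total} dictionary."""
    return dict(conn.execute("SELECT region, total_amount FROM sales_by_region"))

before = region_totals()

# A new operational transaction is visible through the view immediately,
# because the view stores no data of its own.
conn.execute("INSERT INTO orders (region, amount) VALUES ('South', 30.0)")
after = region_totals()

# Stand-in for a materialised view: store the view's current result as a table.
conn.execute("CREATE TABLE sales_by_region_mat AS SELECT * FROM sales_by_region")
materialised_rows = conn.execute(
    "SELECT COUNT(*) FROM sales_by_region_mat"
).fetchone()[0]
```

The view is cheap to set up but recomputes its summary against the operational database on every read, which is one reason only some summary views are usually materialised.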

10. Modelling of Data Warehouse – dimensions and measures

Star schema

• A fact table in the middle connected to a set of dimension tables

Snowflake schema

• A refinement of the star schema in which some dimensional hierarchies are normalised into sets of smaller dimension tables, forming a shape similar to a snowflake

Fact constellation

• Multiple fact tables share dimension tables; the result can be viewed as a collection of stars and is therefore also called a galaxy schema
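
To make the schema shapes concrete, a star schema can be sketched with one fact table and two dimension tables. The table names, columns and rows below are hypothetical and chosen only for illustration; the example uses Python's built-in `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables (hypothetical names and rows)
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_date    (date_id    INTEGER PRIMARY KEY, year INTEGER, month INTEGER);

    -- Fact table in the middle: one key per dimension plus the measures
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id    INTEGER REFERENCES dim_date(date_id),
        units      INTEGER,
        revenue    REAL
    );

    INSERT INTO dim_product VALUES (1, 'Laptop', 'Hardware'), (2, 'Antivirus', 'Software');
    INSERT INTO dim_date    VALUES (1, 2018, 1), (2, 2018, 2);
    INSERT INTO fact_sales  VALUES (1, 1, 2, 2000.0), (1, 2, 1, 1000.0), (2, 2, 5, 250.0);
    """
)

# A typical star join: aggregate a measure, grouped by a dimension attribute.
revenue_by_category = dict(conn.execute(
    """
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    """
))
```

In a snowflake variant of this sketch, `category` would move out of `dim_product` into its own smaller table referenced by a key. In a fact constellation, a second fact table (for example, one for shipments) would reuse the same `dim_product` and `dim_date` tables.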

11. Organising data for Data Warehouses

The data is organized into dimension tables and fact tables, using star and snowflake schemas.

[Figure: sample snowflake schema with the DAILY_SALES table as the fact table]

[Figure: data mart with the DAILY_SALES fact table]

12. Data Mart Centric

13. Data Warehouse Architecture - Base

Two-Tier Data Warehouse Architecture

Web-based Data Warehouse Architecture

Three-Tier Data Warehouse Architecture

14. Extract, Transform and Load (ETL) process

ETL is a process in database usage, and especially in data warehousing, that:

Extracts

• Data is extracted from homogeneous or heterogeneous data sources

• The sources can be files (such as CSV, JSON or XML), an RDBMS, and so on

Transforms

• The data is transformed into the proper format or structure for querying and analysis

• This may involve cleaning, filtering, validating and applying business rules

Loads

• The data is loaded into the final target (a database or, more specifically, an operational data store, data mart or data warehouse)

• The target can also be any other database or application that houses data

Some activities carried out at the "Transforming" stage:

• Cleaning (e.g. “Male” to “M” and “Female” to “F”)

• Filtering (e.g. selecting only certain columns to load)

• Enriching (e.g. Full Name to First Name, Middle Name, Last Name)

• Splitting a column into multiple columns and vice versa

• Joining together data from multiple sources
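
The three ETL stages and some of the transformation activities listed above can be sketched end to end as follows. This is a minimal illustration, not a production pipeline: the inlined CSV, the made-up names in it, the `dim_customer` target table and the rule of skipping rows with an unrecognised gender are all assumptions.

```python
import csv
import io
import sqlite3

# Extract source: a CSV file, inlined here so the sketch is self-contained.
# All rows are made-up sample data.
RAW_CSV = """full_name,gender,country,notes
Thabo John Mokoena,Male,ZA,vip
Lerato Nkosi,Female,ZA,
Sam Lee,Unknown,ZA,test
"""

GENDER_CODES = {"Male": "M", "Female": "F"}

def extract(text):
    """Extract: read the source rows as dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: validate, clean, split and filter the extracted rows."""
    out = []
    for row in rows:
        code = GENDER_CODES.get(row["gender"])
        if code is None:          # validating: skip unrecognised gender values
            continue
        parts = row["full_name"].split()
        out.append({
            "first_name": parts[0],                 # splitting one column
            "middle_name": " ".join(parts[1:-1]),   # may be an empty string
            "last_name": parts[-1],
            "gender": code,                         # cleaning: "Male" -> "M"
        })                        # filtering: country and notes are dropped
    return out

def load(rows, conn):
    """Load: write the transformed rows into the (hypothetical) target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dim_customer "
        "(first_name TEXT, middle_name TEXT, last_name TEXT, gender TEXT)"
    )
    conn.executemany(
        "INSERT INTO dim_customer "
        "VALUES (:first_name, :middle_name, :last_name, :gender)",
        rows,
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute(
    "SELECT first_name, middle_name, last_name, gender "
    "FROM dim_customer ORDER BY rowid"
).fetchall()
```

Keeping the stages as separate functions mirrors the extract, transform, load split and lets each stage be tested or replaced on its own, for example by swapping the CSV reader for a database query.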

QUESTIONS & ENQUIRIES

[email protected]
