Confidential 2
AGILE DATA WAREHOUSE DESIGN
Dao Vo
AGENDA
• Overview of data warehousing
• Designing and implementing a data warehouse
• Waterfall BI/WH development
• Agile BI/WH development framework
• Q&A
OVERVIEW OF DATA WAREHOUSING
What is a data warehouse?
OVERVIEW OF DATA WAREHOUSING
• The business problem
• What is a data warehouse?
• BI/WH Architectures
THE BUSINESS PROBLEM
• Key business data is distributed across multiple systems
• Finding the information required for business decision making is time-consuming and error-prone
• Fundamental business questions are hard to answer
WHAT IS A DATA WAREHOUSE?
• A centralized store of business data for reporting and analysis
• Typically, a data warehouse:
– Contains large volumes of historical data
– Is optimized for querying data (as opposed to inserting or updating)
– Is incrementally loaded with new business data at regular intervals
– Provides the basis for enterprise business intelligence solutions
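The incremental loading mentioned above can be sketched as follows. This is only an illustration, assuming each source row carries an increasing integer `id` that serves as a high-water mark; the function name `incremental_load` and the row layout are hypothetical and do not come from the slides.

```python
def incremental_load(warehouse_rows, source_rows, last_loaded_id):
    """Append source rows newer than the high-water mark; return the new mark.

    Assumption (not from the slides): every row has an increasing integer "id".
    """
    new_rows = [row for row in source_rows if row["id"] > last_loaded_id]
    warehouse_rows.extend(new_rows)
    return max((row["id"] for row in new_rows), default=last_loaded_id)


# Hypothetical data: the warehouse already holds order 1.
warehouse = [{"id": 1, "revenue": 100.0}]
source = [
    {"id": 1, "revenue": 100.0},
    {"id": 2, "revenue": 250.0},
    {"id": 3, "revenue": 75.0},
]

mark = incremental_load(warehouse, source, last_loaded_id=1)
# A second run with no new source rows loads nothing and keeps the same mark.
mark_again = incremental_load(warehouse, source, last_loaded_id=mark)
```

Real loads usually track the mark per source table and often use a timestamp rather than an id; the idea of loading only what is newer than the last run stays the same.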
DESIGNING AND IMPLEMENTING A DATA WAREHOUSE
How to design a data warehouse and BI solution?
DESIGN AND IMPLEMENT WH
• Introduction to Dimensional Modeling
• Star Schemas
• Considerations for Dimension Tables
• Considerations for Fact Tables
• Snowflake Schemas
WAREHOUSE MODELING
INTRODUCTION TO DIMENSIONAL MODELING
• Business questions focus on measures that are aggregated by business dimensions
• Measures are facts about the business
• Dimensions are ways in which the measures can be aggregated
Dimensions: Product Line, Product, Salesperson, Customer, Region, Time
Measures: Quantity, Revenue, Cost, Profit
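As a rough illustration of measures being aggregated by dimensions, the sketch below sums one measure per value of one dimension. The sample rows and the helper `aggregate` are hypothetical, chosen only to mirror the dimension and measure names on the slide.

```python
from collections import defaultdict

# Hypothetical sales rows: each row carries dimension values and measures.
SALES = [
    {"product_line": "Bikes", "region": "North", "quantity": 2, "revenue": 1000.0},
    {"product_line": "Bikes", "region": "South", "quantity": 1, "revenue": 500.0},
    {"product_line": "Helmets", "region": "North", "quantity": 5, "revenue": 250.0},
]


def aggregate(rows, dimension, measure):
    """Sum one measure over the rows, grouped by the values of one dimension."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return dict(totals)


revenue_by_line = aggregate(SALES, "product_line", "revenue")
quantity_by_region = aggregate(SALES, "region", "quantity")
```

The same rows answer different business questions depending on which dimension is used for grouping, which is the point of separating measures from dimensions.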
STAR SCHEMAS
• Group related dimensions into dimension tables
• Group related measures into fact tables
• Relate fact tables to dimension tables by using foreign keys
DimSalesPerson: SalesPersonKey, SalesPersonName, StoreName, StoreCity, StoreRegion
DimProduct: ProductKey, ProductName, ProductLine, SupplierName
DimCustomer: CustomerKey, CustomerName, City, Region
FactOrders: CustomerKey, SalesPersonKey, ProductKey, ShippingAgentKey, TimeKey, OrderNo, LineItemNo, Quantity, Revenue, Cost, Profit
DimDate: DateKey, Year, Quarter, Month, Day
DimShippingAgent: ShippingAgentKey, ShippingAgentName
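A cut-down version of this star schema can be tried out with Python's built-in `sqlite3` module. The sketch keeps only two dimension tables and a few fact columns from the slide; the sample rows and the variable names `star` and `star_rows` are made up for illustration.

```python
import sqlite3

# Build a reduced star schema in an in-memory SQLite database.
star = sqlite3.connect(":memory:")
star.executescript(
    """
    CREATE TABLE DimProduct (
        ProductKey   INTEGER PRIMARY KEY,
        ProductName  TEXT,
        ProductLine  TEXT,
        SupplierName TEXT
    );
    CREATE TABLE DimCustomer (
        CustomerKey  INTEGER PRIMARY KEY,
        CustomerName TEXT,
        City         TEXT,
        Region       TEXT
    );
    CREATE TABLE FactOrders (
        CustomerKey INTEGER REFERENCES DimCustomer (CustomerKey),
        ProductKey  INTEGER REFERENCES DimProduct (ProductKey),
        OrderNo     INTEGER,
        LineItemNo  INTEGER,
        Quantity    INTEGER,
        Revenue     REAL
    );
    """
)

# Hypothetical sample data.
star.executemany(
    "INSERT INTO DimProduct VALUES (?, ?, ?, ?)",
    [(1, "Road Bike", "Bikes", "Acme"), (2, "Helmet", "Accessories", "Acme")],
)
star.executemany(
    "INSERT INTO DimCustomer VALUES (?, ?, ?, ?)",
    [(1, "Ann", "Hanoi", "North"), (2, "Binh", "Ho Chi Minh City", "South")],
)
star.executemany(
    "INSERT INTO FactOrders VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 1, 1001, 1, 2, 1000.0), (2, 1, 1002, 1, 1, 500.0), (2, 2, 1002, 2, 3, 90.0)],
)

# Each dimension is one join away from the fact table.
star_rows = star.execute(
    """
    SELECT c.Region, p.ProductLine, SUM(f.Revenue)
    FROM FactOrders AS f
    JOIN DimCustomer AS c ON c.CustomerKey = f.CustomerKey
    JOIN DimProduct  AS p ON p.ProductKey  = f.ProductKey
    GROUP BY c.Region, p.ProductLine
    ORDER BY c.Region, p.ProductLine
    """
).fetchall()
```

Because every dimension attribute sits directly in its dimension table, grouping by `Region` or `ProductLine` needs exactly one join per dimension.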
SNOWFLAKE SCHEMAS
DimSalesPerson: SalesPersonKey, SalesPersonName, StoreKey
DimProduct: ProductKey, ProductName, ProductLineKey, SupplierKey
DimCustomer: CustomerKey, CustomerName, GeographyKey
FactOrders: CustomerKey, SalesPersonKey, ProductKey, ShippingAgentKey, TimeKey, OrderNo, LineItemNo, Quantity, Revenue, Cost, Profit
DimDate: DateKey, Year, Quarter, Month, Day
DimShippingAgent: ShippingAgentKey, ShippingAgentName
DimProductLine: ProductLineKey, ProductLineName
DimGeography: GeographyKey, City, Region
DimSupplier: SupplierKey, SupplierName
DimStore: StoreKey, StoreName, GeographyKey
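The snowflake variant normalizes dimension attributes into their own tables, so the same question needs an extra join. The sketch below, again using `sqlite3`, keeps only the product branch of the schema above; the sample rows and the names `snow` and `snow_rows` are hypothetical.

```python
import sqlite3

# Build the product branch of the snowflake schema in memory.
snow = sqlite3.connect(":memory:")
snow.executescript(
    """
    CREATE TABLE DimProductLine (
        ProductLineKey  INTEGER PRIMARY KEY,
        ProductLineName TEXT
    );
    CREATE TABLE DimProduct (
        ProductKey     INTEGER PRIMARY KEY,
        ProductName    TEXT,
        ProductLineKey INTEGER REFERENCES DimProductLine (ProductLineKey)
    );
    CREATE TABLE FactOrders (
        ProductKey INTEGER REFERENCES DimProduct (ProductKey),
        Quantity   INTEGER,
        Revenue    REAL
    );
    """
)

# Hypothetical sample data.
snow.executemany(
    "INSERT INTO DimProductLine VALUES (?, ?)",
    [(1, "Bikes"), (2, "Accessories")],
)
snow.executemany(
    "INSERT INTO DimProduct VALUES (?, ?, ?)",
    [(1, "Road Bike", 1), (2, "Mountain Bike", 1), (3, "Helmet", 2)],
)
snow.executemany(
    "INSERT INTO FactOrders VALUES (?, ?, ?)",
    [(1, 2, 1000.0), (2, 1, 800.0), (3, 3, 90.0)],
)

# The product line name is now two joins away from the fact table.
snow_rows = snow.execute(
    """
    SELECT pl.ProductLineName, SUM(f.Revenue)
    FROM FactOrders AS f
    JOIN DimProduct     AS p  ON p.ProductKey      = f.ProductKey
    JOIN DimProductLine AS pl ON pl.ProductLineKey = p.ProductLineKey
    GROUP BY pl.ProductLineName
    ORDER BY pl.ProductLineName
    """
).fetchall()
```

The trade-off is visible in the query: the product line name is stored once in `DimProductLine`, at the cost of a longer join path than in the star schema.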
WAREHOUSE MODELING
WATERFALL BI/WH DEVELOPMENT
Traditional SDLC to develop a BI/WH product
WATERFALL BI/WH DEVELOPMENT
• SDLC Overview
SDLC OVERVIEW
AGILE BI/WH DEVELOPMENT FRAMEWORK
Incremental development framework for BI/WH product
AGILE BI/WH DEVELOPMENT FRAMEWORK
• Agile BI/WH life cycle
• Agile DW design overview
• Agile ETL Solution
AGILE BI/WH LIFE CYCLE
AGILE DW DESIGN OVERVIEW
How do we design to answer business questions?
AGILE DW DESIGN OVERVIEW
• How do we ask questions?
• The 7Ws framework
• Design using natural language
• Straightforward methodology
• Model storming
• BEAM methodology
HOW DO WE ASK QUESTIONS?
• Events/Transactions
– An immutable "fact" that occurs in a time and place
• Interrogatives:
– Who, What, When, Where, Why
– Descriptive context that fully describes the event
– A set of "dimensions" that describe events
THE 7WS FRAMEWORK
• Who: Customer, Employee, Seller, Organization
• What: Product, Service, Transactions, Booking, Event
• When: Time, Day, Month, Year
• Where: Location, Geographic, Store, Ship to, Hospital
• Why: Causal, Promotion, Reason, Weather, Competition
• How
• How Many (facts): Much, Many, Often, £ $ €
DESIGN USING NATURAL LANGUAGE
• Verbs – Events – Relationships – Fact Tables
• Nouns – Details – Entities – Dimensions
• Main Clause – Subject-Verb-Object
• Prepositions – connect additional details to the main clause
• Interrogatives – The 7Ws – Dimension Types
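The mapping from natural language to a dimensional model can be sketched in code. The example below is a loose illustration only: the `EventStory` class, the `to_star_outline` function, the `Fact`/`Dim` naming rule and the sample sentence are all hypothetical and are not part of the methodology as presented.

```python
from dataclasses import dataclass, field


@dataclass
class EventStory:
    """A business event phrased as subject-verb-object plus prepositional details."""

    subject: str  # noun, treated here as a "who" dimension
    verb: str     # event, becomes the fact table
    obj: str      # noun, treated here as a "what" dimension
    # Each detail is (preposition, noun, interrogative).
    details: list = field(default_factory=list)


def to_star_outline(story):
    """Map the verb to a fact table and each noun to a dimension tagged by its W."""
    dimensions = {
        "Dim" + story.subject: "who",
        "Dim" + story.obj: "what",
    }
    for _preposition, noun, interrogative in story.details:
        dimensions["Dim" + noun] = interrogative
    return {"fact": "Fact" + story.verb.capitalize(), "dimensions": dimensions}


# Hypothetical story: "Customer orders Product on Date from SalesPerson".
story = EventStory(
    subject="Customer",
    verb="orders",
    obj="Product",
    details=[("on", "Date", "when"), ("from", "SalesPerson", "who")],
)
outline = to_star_outline(story)
```

For this story the outline names a `FactOrders` table with `DimCustomer`, `DimProduct`, `DimDate` and `DimSalesPerson`, which lines up with the star schema shown earlier in the deck.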
STRAIGHTFORWARD METHODOLOGY
[Figure: numbered design steps (1 to 9) laid out across the columns Who, What, When, Where, How (many), Why, How. Step labels: Declare Event Type; Subject-Verb-Object; Quantities - Facts; Sufficient Detail Fact Granularity; Initial Data Examples.]
BUSINESS EVENT ANALYSIS AND MODELING (BEAM✲)
An agile approach to dimensional modeling
MODEL STORMING
• Quick
• Inclusive
• Interactive
• Fun
Participants: Data Modeler, BI Stakeholders
BEAM✲ METHODOLOGY
Structured, non-technical, collaborative working conversation directly with BI users
• BI users' business process, organizational, hierarchical, and data knowledge
• Focused data profiling
• Logical and physical dimensional data models
• Example data
• Detailed and testable ETL specification
• DW prototype
[Figure: BEAM✲, Data Modeler, BI Stakeholders]
Q&A
© 2013 KMS Technology
THANK YOU.