Date post: | 28-Nov-2014 |
Category: |
Documents |
Upload: | sudha-yallasiri |
C3: Protected
Data Warehousing
Basics – I
Basic
©Copyright 2005, Cognizant Academy, All Rights Reserved 2
About the Author
Created By: Gughanand Dhananjayan (134701)
Credential Information: 3 years of experience in DW and BI tools
Version and Date: DWBASIC/PPT/1106/1.0
Icons Used
• Questions
• A Welcome Break
• Coding Standards
• Demo
• Key Contacts
• Reference
• Test Your Understanding
• Hands-on Exercise
Data Warehousing Basics – I: Overview
Introduction:
This chapter explains Data Warehouse and Data Mart concepts.
Data Warehousing Basics – I: Objectives
Objective:
After completing this chapter, you will be able to:
• Explain the Data Warehouse concepts
• Describe the data marts and the architectures
What is an Operational System?
• Operational systems are what the name implies: the systems that help you run day-to-day enterprise operations.
• These are the backbone systems of any enterprise, such as order entry and inventory.
• Classic examples are airline reservations, credit-card authorizations, and ATM withdrawals.
• Because of their importance to the organization, operational systems were almost always the first parts of the enterprise to be computerized.
• Over the years, these operational systems have been extended, rewritten, enhanced, and maintained to the point that they are completely integrated into the organization.
Characteristics of Operational Systems
The following are the characteristics of an operational system:
• Continuous availability
• Predefined access paths
• Transaction integrity
• High volume of transactions
• Low data volume per query
• Used by operational staff
• Supports day-to-day control operations
• Supports a large number of users
Historical Look at Informational Processing
Data → Information → Knowledge
The goal of Informational Processing is to turn data into information!
Why?
Because business questions are answered using information and the knowledge of how to apply that information to a given problem.
Need for a Separate Informational System
• Data: Informational data is distinctly different from operational data in its structure and content.
• Processing: Informational processing is distinctly different from operational processing in its characteristics and use of data.
The Information Center
• Management requires business information
• A request for a report is made to the Information Center
• Requirements for the report must be clarified
• Information Center works on developing the report
The Information Center (Contd.)
• Report provided to analyst
• Analyst manipulates data for decision making
• Management receives information, but...
What took so long? and
How do I know it’s right?
The Information Center (Contd.)
Too Many Steps Involved!
Tactical Information
Diagram: OLTP Server; Inventory Control System (Production quantity, Transported quantity, Order quantity)
• Supports day-to-day control operations
• Transaction Processing
• High Performance Operational Systems
• Fast Response Time
• Initiates immediate action
Strategic Information
Diagram: Payroll, Finance, Marketing, Production and Inventory
• Understand Business Issues
• Analyze Trends and Relationships
• Analyze Problems
• Discover Business Opportunities
• Plan for the Future
Need for Tactical and Strategic Information
• Operational data helps the organization to meet the operational and tactical requirements for data.
• Data Warehouse data helps the organization to meet strategic requirements for information.
Diagram: OLTP Server (Operational Data, Tactical Information); Periodic Refresh; Data Warehouse Server (Strategic Information)
Operational Vs Analytical Systems
Operational                             | Analytical
Primarily primitive                     | Primarily derived
Current; accurate as of now             | Historical; accuracy maintained over time
Constantly updated                      | Less frequently updated
Minimal redundancy                      | Managed redundancy
Highly detailed data                    | Summarized data
Referential integrity                   | Historical integrity
Supports day-to-day business functions  | Supports long-term informational requirements
Normalized design                       | De-normalized design
Data Warehouse: Definition
The Data Warehouse is a collection of data in support of management decision processes that is:
• Subject oriented
• Integrated
• Time variant
• Non-volatile
Data Warehouse: Differences from Operational Systems
• Operational Systems: Operational data is organized by specific processes or tasks and is maintained by separate systems
• Data Warehouse: Warehoused data is organized by subject area and is populated from many operational systems
Diagram: Accounting, Order Entry, Billing; Customer, Usage, Revenue
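The difference between process-oriented and subject-oriented organization can be illustrated with a minimal Python sketch. The two source systems, their field names, and the sample rows are all hypothetical, chosen only to show records from separate operational systems being regrouped under one subject (the customer).

```python
from collections import defaultdict

# Hypothetical extracts from two process-oriented operational systems.
order_entry = [
    {"customer_id": 1, "order_id": "A100", "amount": 250.0},
    {"customer_id": 2, "order_id": "A101", "amount": 75.0},
]
billing = [
    {"customer_id": 1, "invoice_id": "INV-9", "billed": 250.0},
    {"customer_id": 2, "invoice_id": "INV-10", "billed": 75.0},
    {"customer_id": 1, "invoice_id": "INV-11", "billed": 40.0},
]


def build_customer_subject_area(orders, invoices):
    """Regroup process-specific records under one subject: the customer."""
    subject = defaultdict(lambda: {"orders": [], "invoices": []})
    for row in orders:
        subject[row["customer_id"]]["orders"].append(row["order_id"])
    for row in invoices:
        subject[row["customer_id"]]["invoices"].append(row["invoice_id"])
    return dict(subject)


customer_area = build_customer_subject_area(order_entry, billing)
print(customer_area[1])
```

Each operational system still owns its own records; the subject area only gathers them by customer, which is the sense in which warehoused data is populated from many operational systems.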
Data Warehouse: Differences from Operational Systems (Contd.)
Operational Systems (Application Specific):
• Applications and their databases were designed and built separately
• Evolved over long periods of time
Data Warehouse (Integrated):
• Integrated from the start
• Designed (or “Architected”) at one time, implemented iteratively over short periods of time
Data Warehouse: Differences from Operational Systems (Contd.)
• Operational Systems: Primarily concerned with current data
• Data Warehouse: Generally concerned with historical data
Data Warehouse: Differences from Operational Systems (Contd.)
Operational systems (Constant Change):
• Updated constantly
• Data changes according to need, not on a fixed schedule
Data warehouse (Consistent Points in Time):
• Added regularly, but loaded data are rarely changed directly
• Does not mean the data warehouse is never updated or never changes!
Diagram: operational database (Insert, Update, Delete); data warehouse (Initial Load, Incremental Load, Load/Update)
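The contrast between constant change and consistent points in time can be sketched in Python as follows. This is an illustrative assumption, not a description of any particular product: the table layouts, item names, and dates are hypothetical. For simplicity each load appends a full dated snapshot, whereas a real incremental load would usually extract only the rows that changed.

```python
from datetime import date

# Operational table: one current row per key, changed in place as needed.
operational = {}


def oltp_insert(key, qty):
    operational[key] = qty


def oltp_update(key, qty):
    operational[key] = qty  # the old value is overwritten


def oltp_delete(key):
    del operational[key]


# Warehouse table: rows are appended per load date and not changed afterwards.
warehouse = []


def warehouse_load(snapshot, load_date):
    """Append the operational data as of one consistent point in time."""
    for key, qty in sorted(snapshot.items()):
        warehouse.append({"load_date": load_date, "item": key, "qty": qty})


oltp_insert("widget", 10)
oltp_insert("gadget", 5)
warehouse_load(operational, date(2005, 1, 1))  # initial load

oltp_update("widget", 7)
oltp_delete("gadget")
warehouse_load(operational, date(2005, 1, 8))  # later load

history = [(r["load_date"].isoformat(), r["item"], r["qty"]) for r in warehouse]
print(operational)
print(history)
```

After the second load the operational table holds only the current state, while the warehouse still holds the earlier quantities, including the deleted item.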
Data in a Data Warehouse
What about the data in the Data Warehouse?
• Separate DSS database
• Storage of data only, no data is created
• Integrated and scrubbed data
• Historical data
• Read-only (no recasting of history)
• Various levels of summarization
• Metadata
• Subject oriented
• Easily accessible
Data Warehousing Features
The following are the features of Data Warehousing:
• Strategic enterprise level decision support
• Multi-dimensional view on the enterprise data
• Caters to the entire spectrum of management
• Descriptive, standard business terms
• High degree of scalability
• High analytical capability
• Historical data only
Data Warehouse: Business Benefits
Benefits to business:
• Understand business trends
• Better forecasting decisions
• Bring better products to market in a timely manner
• Analyze daily sales information and make quick decisions
• Solution for maintaining your company's competitive edge
Data Warehouse: Application Areas
Following are some Business Applications of a Data Warehouse:
• Risk management
• Financial analysis
• Marketing programs
• Profit trends
• Procurement analysis
• Inventory analysis
• Statistical analysis
• Claims analysis
• Manufacturing optimization
• Customer relationship management
Data Marts: Overview
• A data mart is a decentralized subset of data, found either in a Data Warehouse or as a standalone subset, designed to support the unique business unit requirements of a specific decision-support system.
• Data marts have specific business-related purposes, such as measuring the impact of marketing promotions or measuring and forecasting sales performance.
Diagram: an Enterprise Data Warehouse and two Data Marts
Data Marts: Features
The following are a few important features of data marts:
• Low cost
• Controlled locally rather than centrally, conferring power on the user group
• Contain less information than the warehouse
• Rapid response
• More easily understood and navigated than an enterprise Data Warehouse
• Within the range of divisional or departmental budgets
Advantages of Data Mart over Data
Warehouse
Data mart advantages:
• Typically single subject area and fewer dimensions
• Limited feeds
• Very quick time to market (30-120 days to pilot)
• Quick impact on bottom line problems
• Focused user needs
• Limited scope
• Optimum model for DW construction
• Demonstrates ROI
• Allows prototyping
Disadvantages of Data Mart
Data Mart disadvantages:
• Does not provide an integrated view of business information.
• Uncontrolled proliferation of data marts results in redundancy.
• A large number of data marts is complex to maintain.
• Scalability issues for a large number of users and increased data volume.
Architected Data Warehouse
Diagram: Source systems, Data Staging, Enterprise Data Warehouse (with Metadata), End user access
Unarchitected Data Marts Vs Data Warehouse
Unarchitected Data Marts (diagram: Source systems, Data marts, End user access):
• Easy to do, not architected
• Are the extracts, transformations, integrations and loads consistent?
• Is the redundancy managed?
• What is the impact on the sources?
Data Warehouse (diagram: Source systems, Data Staging, Enterprise Data Warehouse, Metadata, End user access):
• Architected
• Data and results consistent
• Redundancy is managed
• Detailed history available for drill-down
• Metadata is consistent!
Operational Data Store Definition
The Operational Data Store (ODS) is defined to be a structure that is:
• Integrated
• Subject oriented
• Volatile: data can be updated
• Current valued, containing data that is a day or perhaps a month old
• Detailed, containing detailed data only
ODS: Needs
• To obtain a “system of record” that contains the best data that exists in a legacy environment as a source of information.
• Best data means data that is:
– Complete
– Up to date
– Accurate
• In conformance with the organization’s information model.
ODS: Insulated from OLTP
• ODS data resolves data integration issues.
• Data physically separated from production environment to insulate it from the processing demands of reporting and analysis.
• Access to current data facilitated.
Diagram: OLTP Server, ODS, Tactical Analysis
ODS: Data
• Detailed data: records of business events, for example, order capture
• Data from heterogeneous sources
• Does not store summary data
• Contains current data
ODS: Benefits
Diagram: Flat files (sample rows 60,5.2,”JOHN” and 72,6.2,”DAVID”), Relational Database, Excel files, Operational Data Store
The following are the benefits of ODS:
• Integrates the data
• Synchronizes the structural differences in data
• High transaction performance
• Serves the operational and DSS environment
• Transaction level reporting on current data
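As a rough illustration of how an ODS might synchronize structural differences between sources, the Python sketch below maps a positional flat file and a relational-style row onto one common record shape. The two flat-file rows are taken from the slide's diagram, but the meaning of their columns (weight, height, name) is an assumption, and the relational column names and the third sample row are hypothetical.

```python
import csv
import io

# Flat-file source: positional fields. The column meanings
# (weight, height, name) are assumed for illustration.
FLAT_FILE = '60,5.2,"JOHN"\n72,6.2,"DAVID"\n'

# Relational source: named columns in a different layout (hypothetical).
RELATIONAL_ROWS = [
    {"EMP_NAME": "Mary", "WT": "55", "HT": "5.4"},
]


def from_flat_file(text):
    """Convert positional flat-file rows to the common ODS record shape."""
    for weight, height, name in csv.reader(io.StringIO(text)):
        yield {"name": name.upper(), "weight": float(weight), "height": float(height)}


def from_relational(rows):
    """Convert relational rows to the same common ODS record shape."""
    for row in rows:
        yield {
            "name": row["EMP_NAME"].upper(),
            "weight": float(row["WT"]),
            "height": float(row["HT"]),
        }


ods = list(from_flat_file(FLAT_FILE)) + list(from_relational(RELATIONAL_ROWS))
for record in ods:
    print(record)
```

Once every source is converted by its own small adapter, downstream reporting only has to understand the single common record shape.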
Operational Data Store: Update schedule
ODS Data:
• Update schedule: at intervals of a day or less
• Detail of data is mostly between 30 and 90 days
• Addresses operational needs
Data Warehouse Data:
• Update schedule: at intervals of a week or more
• Potentially infinite history
• Addresses strategic needs
ODS Vs Data Warehouse Characteristics
What is OLAP?
• OLAP tools are used for analyzing data.
• They help users gain insight into the organization's data.
• They help users carry out multi-dimensional analysis on the available data.
• Using OLAP techniques, users can view the data from different perspectives.
• Helps in decision making and business planning
• Converts OLTP data into information
• Solution for maintaining your company's competitive edge
OLAP Terminology
• Drill Down and Drill Up
• Slice and Dice
• Multi-dimensional analysis
• What-if analysis
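The terms above can be made concrete with a small Python sketch over an in-memory list of facts. The dimensions (year, quarter, region, product), the sales figures, and the 10% uplift used for the what-if case are all hypothetical; real OLAP tools work against a cube or a database rather than a list, so this only shows the idea of each operation.

```python
from collections import defaultdict

# Hypothetical sales facts with four dimensions and one measure.
FACTS = [
    {"year": 2004, "quarter": "Q1", "region": "East", "product": "Pen", "sales": 100},
    {"year": 2004, "quarter": "Q2", "region": "East", "product": "Pen", "sales": 120},
    {"year": 2004, "quarter": "Q1", "region": "West", "product": "Ink", "sales": 80},
    {"year": 2005, "quarter": "Q1", "region": "East", "product": "Ink", "sales": 90},
]


def aggregate(facts, dims):
    """Sum the sales measure grouped by the given dimension names."""
    totals = defaultdict(int)
    for fact in facts:
        totals[tuple(fact[d] for d in dims)] += fact["sales"]
    return dict(totals)


def select(facts, **fixed):
    """Keep only the facts whose dimensions equal the fixed values."""
    return [f for f in facts if all(f[k] == v for k, v in fixed.items())]


# Drill up: view sales at the coarser year level.
by_year = aggregate(FACTS, ["year"])
# Drill down: break each year into its quarters.
by_year_quarter = aggregate(FACTS, ["year", "quarter"])
# Slice: fix one dimension.
east_only = select(FACTS, region="East")
# Dice: fix several dimensions to get a smaller sub-cube.
east_pens_2004 = select(FACTS, region="East", product="Pen", year=2004)
# What-if: recompute yearly totals assuming every sale grows by 10%.
what_if = aggregate(
    [{**f, "sales": f["sales"] * 110 // 100} for f in FACTS], ["year"]
)

print(by_year)
print(by_year_quarter)
print(what_if)
```

Grouping by more or fewer dimensions with the same `aggregate` function is what makes drill down and drill up two directions of one operation.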
Basic Data Warehouse Architecture
Diagram components:
• Operational & External data
• ODS
• Data Staging layer
• Data warehouse
• Data Marts
• Information Servers
• Information Access
• Reporting tools
• Web Browsers
• OLAP
• Mining
• Meta Data Management
• Administration
Operational and External Data Layer
• The database-of-record
• Consists of system specific reference data and event data
• Source of data for the Data Warehouse
• Contains detailed data
• Continually changes due to updates
• Stores data up to the last transaction
Data Staging Layer
• Extracts data from operational and external databases
• Transforms the data and loads into the Data Warehouse
• This includes decoding production data and merging of records from multiple DBMS formats
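The staging steps described above (extract, decode, merge, load) might look roughly like the following Python sketch. It is a simplified assumption rather than the method of any specific ETL tool: the code table, the two sources, their field names, and the in-memory warehouse table are all hypothetical.

```python
# Hypothetical production code table; each source encodes gender differently.
GENDER_CODES = {"M": "Male", "F": "Female", "1": "Male", "0": "Female"}

# Extract: rows as they might arrive from two operational sources.
source_a = [{"cust_id": 101, "gender": "M", "city": "Chennai"}]
source_b = [
    {"cust_id": 101, "gender": "1", "phone": "555-0100"},
    {"cust_id": 102, "gender": "0", "phone": "555-0101"},
]


def transform(row):
    """Decode production codes into descriptive business terms."""
    decoded = dict(row)
    decoded["gender"] = GENDER_CODES[row["gender"]]
    return decoded


def merge(*sources):
    """Merge records that share a key into one row per customer."""
    merged = {}
    for source in sources:
        for row in source:
            merged.setdefault(row["cust_id"], {}).update(transform(row))
    return merged


def load(merged, warehouse):
    """Append the merged rows to the in-memory warehouse table."""
    warehouse.extend(merged[key] for key in sorted(merged))


warehouse_table = []
load(merge(source_a, source_b), warehouse_table)
for row in warehouse_table:
    print(row)
```

Decoding happens before merging so that the two sources agree on the gender value for customer 101 even though they stored it as different codes.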
Data Warehouse Layer
• Stores data used for informational analysis
• Presents summarized data to the end-user for analysis
• The nature of the operational data, the end-user requirements and the business objectives of the enterprise determine the structure
Meta Data Layer
• Metadata is data about data.
• Stored in a repository.
• Contains all corporate Metadata resources: database catalogs and data dictionaries
Process Management Layer
• Scheduler or the high-level job control
• To build and maintain the Data Warehouse and data directory information
• To keep the Data Warehouse up-to-date.
Information Access Layer
• Interfaced with the Data Warehouse through an OLAP server
• Performs analytical operations and presents data for analysis
• End-users generate ad-hoc reports and perform multidimensional analysis using OLAP tools
Data Warehouse Architecture:
Implementation
The following should be considered for a successful implementation of a Data
Warehousing solution:
• Architecture:
– Open Data Warehousing architecture with common interfaces for product
integration
– Data Warehouse database server
• Tools:
– Data Modeling tools
– Extraction and Transformation/propagation tools
– Analysis/end-user tools: OLAP and Reporting
– Metadata Management tools
Enterprise Data Warehouse (EDW)
• An Enterprise Data Warehouse (EDW) contains detailed as well as summarized
data.
• Separate subject-oriented database.
• Supports detailed analysis of business trends over a period of time.
• Used for short- and long-term business planning and decision making covering
multiple business units.
EDW - “Top Down” Approach
Diagram components:
• Heterogeneous Source Systems (Source 1, Source 2, Source 3)
• Staging: Common Staging interface Layer
• Enterprise Data Warehouse
• Data mart bus architecture Layer
• Incremental Architected data marts (DM 1, DM 2, DM 3)
EDW - “Top Down” Approach: Implementation
• An EDW is composed of multiple subject areas, such as Finance, Human Resources, Marketing, Sales, Manufacturing, and so on.
• In a top-down scenario, the entire EDW is architected, and then a small slice of a subject area is chosen for construction.
• Subsequent slices are constructed, until the entire EDW is complete.
Upsides and Downsides of Top-Down Approach
• The upsides to a “Top Down” approach are:
1. Coordinated environment
2. Single point of control & development
• The downsides to a “Top Down” approach are:
1. “Cross everything” nature of enterprise project
2. Analysis paralysis
3. Scope control
4. Time to market
5. Risk and exposure
EDW - “Bottom Up” Approach
Diagram components:
• Heterogeneous Source Systems (Source 1, Source 2, Source 3)
• Staging: Common Staging interface Layer
• Data mart bus architecture Layer
• Incremental Architected data marts (DM 1, DM 2, DM 3)
• Enterprise Data Warehouse
EDW- “Bottom Up” Approach:
Implementation
• Initially an Enterprise Data Mart Architecture (EDMA) is developed.
• Once the EDMA is complete, an initial subject area is selected for the first
incremental Architected Data Mart (ADM).
• The EDMA is expanded in this area to include the full range of detail required for
the design and development of the incremental ADM.
Upsides and Downsides of Bottom Up
Approach
• The upsides to a “bottom up” approach are:
1. Quick Return On Investment (ROI)
2. Low risk, low political exposure learning and development environment
3. Lower level, shorter-term political will required
4. Fast delivery
5. Focused problem, focused team
6. Inherently incremental
• The downsides to a “bottom up” approach are:
1. Multiple team coordination
2. Must have an EDMA to integrate incremental data marts
• Allow time for questions from participants
Test Your Understanding
1. What is the difference between Data Warehouse and Data Marts?
2. Explain Top-down and Bottom-up Approach.
3. What are the differences between ODS in DW and OLTP?
4. Explain OLAP.
Data Warehousing Basics – I: Summary
• Operational systems are what the name implies: the systems that help you run day-to-day enterprise operations.
• The following are the features of Data Warehousing:
– Strategic enterprise level decision support
– Multi-dimensional view on the enterprise data
– Caters to the entire spectrum of management
– Descriptive, standard business terms
– High degree of scalability
– High analytical capability
– Historical data only
• A data mart is a decentralized subset of data, found either in a Data Warehouse or as a standalone subset, designed to support the unique business unit requirements of a specific decision-support system.
Data Warehousing Basics – I: Summary
(Contd.)
• Data marts have specific business-related purposes such as measuring the
impact of marketing promotions, or measuring and forecasting sales
performance, and so on.
• The Operational Data Store (ODS) is defined to be a structure that is integrated,
subject oriented, volatile, current valued, and contains detailed data only.
• An Enterprise Data Warehouse (EDW) contains detailed as well as summarized
data.
• An EDW is composed of multiple subject areas, such as Finance, Human
Resources, Marketing, Sales, Manufacturing, and so on.
Data Warehousing Basics – I: Source
• Datawarehousing Guide - Ralph Kimball
Disclaimer: Parts of the content of this course are based on the materials available from the
Web sites and books listed above. The materials that can be accessed from linked sites are
not maintained by Cognizant Academy and we are not responsible for the contents thereof.
All trademarks, service marks, and trade names in this course are the marks of the
respective owner(s).
You have successfully completed Data Warehousing Basics – I.