By Rohit Agrawal Jan 2013
Is the data correct…?
Would the table Customer_fact ever get loaded if the Control Table has the following…?
What if the DW load frequency is daily and ETL execution time exceeds 24 hrs…?
CustID  Title  FirstName  LastName  Gender  DateOfBirth  Age
123     Mr     Chris      Joseph    F       12/12/2000   22
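The customer record above can be checked mechanically. The following is a minimal sketch, not a definitive rule set: the title-to-gender mapping, the `check_customer` and `age_on` helpers, and the reference date of 31 January 2013 are all assumptions made for illustration.

```python
from datetime import date

# Hypothetical mapping of titles to the gender they imply (not from the slides).
TITLE_GENDER = {"Mr": "M", "Mrs": "F", "Ms": "F", "Miss": "F"}


def age_on(dob, as_of):
    """Completed years between a date of birth and a reference date."""
    years = as_of.year - dob.year
    # Subtract one year if the birthday has not yet occurred in as_of's year.
    if (as_of.month, as_of.day) < (dob.month, dob.day):
        years -= 1
    return years


def check_customer(row, as_of):
    """Return a list of consistency problems found in one customer row."""
    problems = []
    expected_gender = TITLE_GENDER.get(row["Title"])
    if expected_gender is not None and expected_gender != row["Gender"]:
        problems.append("Title and Gender disagree")
    if age_on(row["DateOfBirth"], as_of) != row["Age"]:
        problems.append("Age does not match DateOfBirth")
    return problems


# The sample row from the slide; as_of is an assumed reference date.
row = {
    "CustID": 123, "Title": "Mr", "FirstName": "Chris", "LastName": "Joseph",
    "Gender": "F", "DateOfBirth": date(2000, 12, 12), "Age": 22,
}
issues = check_customer(row, as_of=date(2013, 1, 31))
print(issues)
```

Under these assumptions the record fails both checks: the title suggests a male customer while Gender is F, and a customer born in December 2000 cannot be 22 in January 2013.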
DataBase  Schema    Table      LastLoadDate  LoadFrequency
Info_DB   Customer  Cust_fact  31/01/2012    D
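Whether a table with this control row is ever loaded depends on how the ETL job interprets LastLoadDate, which the slides do not specify. As a hedged sketch, the two rules below (`is_due_strict` and `is_due_catch_up`, both hypothetical names) show how a stale date can block a load forever under one rule and not under the other. The run date and the meaning of `D` as "daily" are assumptions.

```python
from datetime import date, timedelta

# Assumption: LoadFrequency "D" means the table is loaded once per day.
FREQUENCY_DAYS = {"D": 1}


def next_due(last_load, frequency):
    """Date on which the next load is expected."""
    return last_load + timedelta(days=FREQUENCY_DAYS[frequency])


def is_due_strict(last_load, frequency, run_date):
    """Load only when the run date is exactly the next due date."""
    return run_date == next_due(last_load, frequency)


def is_due_catch_up(last_load, frequency, run_date):
    """Load whenever the next due date has been reached or passed."""
    return run_date >= next_due(last_load, frequency)


last_load = date(2012, 1, 31)   # 31/01/2012 from the control table row
run_date = date(2013, 1, 31)    # hypothetical date of the current ETL run

strict = is_due_strict(last_load, "D", run_date)
catch_up = is_due_catch_up(last_load, "D", run_date)
print(strict, catch_up)
```

With the strict rule, a table whose LastLoadDate has fallen behind is never picked up again, which is the kind of defect a control-table test should expose.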
Introduction
Testing Process
Focus Points
Challenges
Best Practices
Introduction
What? Exhaustive testing of a Data Warehouse during its design and on an ongoing basis
Why? Organisational decisions depend entirely on enterprise data, so that data must be of the highest quality
Where? From the source systems through to reporting
When? From the design phase through to production
Testing Process
Focus Points
All customer data from the different bank branches is loaded
Customers older than 60 are inserted into the senior citizen category
Records are errored out if the customer does not belong to the bank
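The three checks above can be sketched as small test helpers. This is an illustration under stated assumptions, not the presenter's implementation: the row layout (`CustID`, `Branch`, `Age`), the category labels, and the function names are hypothetical.

```python
from collections import Counter

SENIOR_AGE_LIMIT = 60  # from the slide: seniors are customers older than 60


def branch_count_mismatches(source_rows, loaded_rows):
    """Map each branch whose row counts differ to (source count, loaded count)."""
    source_counts = Counter(row["Branch"] for row in source_rows)
    loaded_counts = Counter(row["Branch"] for row in loaded_rows)
    branches = set(source_counts) | set(loaded_counts)
    # A Counter returns 0 for a branch it has never seen.
    return {
        branch: (source_counts[branch], loaded_counts[branch])
        for branch in branches
        if source_counts[branch] != loaded_counts[branch]
    }


def expected_category(customer, bank_customer_ids):
    """Category the load should assign: 'error', 'senior', or 'regular'."""
    if customer["CustID"] not in bank_customer_ids:
        return "error"
    if customer["Age"] > SENIOR_AGE_LIMIT:
        return "senior"
    return "regular"


source = [
    {"CustID": 1, "Branch": "A", "Age": 65},
    {"CustID": 2, "Branch": "A", "Age": 60},
    {"CustID": 3, "Branch": "B", "Age": 30},
]
loaded = source[:2]        # simulate branch B being dropped by the load
bank_ids = {1, 2}          # customer 3 is not known to the bank

mismatches = branch_count_mismatches(source, loaded)
categories = [expected_category(c, bank_ids) for c in source]
print(mismatches, categories)
```

Note the boundary case: a customer aged exactly 60 is not a senior citizen under the "greater than 60" rule, so boundary values belong in the test data.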
February has 29 days in leap years only
A row in stage with AccountID=123 has the expected data in the DW
Zip codes have 6 digits and state names are properly abbreviated
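A minimal sketch of these checks follows, assuming rows are available as plain dictionaries. The helper names and the sample values are hypothetical; the state abbreviation check is omitted because the slides do not give the list of valid abbreviations.

```python
import calendar
import re

ZIP_RE = re.compile(r"\d{6}")  # exactly six digits, as stated on the slide


def february_days_ok(year, days_in_dimension):
    """True if the date dimension holds the right number of February days."""
    expected = 29 if calendar.isleap(year) else 28
    return days_in_dimension == expected


def zip_ok(zip_code):
    """True if the zip code consists of exactly six digits."""
    return ZIP_RE.fullmatch(zip_code) is not None


def row_diff(stage_row, dw_row):
    """Names of columns whose values differ between stage and DW."""
    columns = set(stage_row) | set(dw_row)
    return sorted(c for c in columns if stage_row.get(c) != dw_row.get(c))


# Hypothetical stage and DW rows for AccountID=123.
stage_row = {"AccountID": 123, "Zip": "560001", "State": "KA"}
dw_row = {"AccountID": 123, "Zip": "56001", "State": "KA"}

differences = row_diff(stage_row, dw_row)
print(differences, zip_ok(dw_row["Zip"]))
```

Century years are worth including in the date tests: 1900 is not a leap year, while 2000 is.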
Impact of executing complex queries during data load; reports render within 30 sec
Source system scheduling conflicts
Incremental loads driven by audit columns such as LastUpdateDate or an incremental flag
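The incremental-load check can be sketched as follows. The idea is to compute independently which source rows should be picked up and compare that with what the ETL actually loaded. The function name, the row layout, and the timestamps are assumptions for illustration.

```python
from datetime import datetime


def rows_for_incremental_load(source_rows, last_load_time):
    """Rows changed after the previous load, judged by LastUpdateDate."""
    return [row for row in source_rows if row["LastUpdateDate"] > last_load_time]


# Hypothetical source rows carrying an audit column.
source_rows = [
    {"AccountID": 121, "LastUpdateDate": datetime(2013, 1, 29, 9, 0)},
    {"AccountID": 122, "LastUpdateDate": datetime(2013, 1, 30, 23, 59)},
    {"AccountID": 123, "LastUpdateDate": datetime(2013, 1, 31, 8, 30)},
]
last_load_time = datetime(2013, 1, 30, 0, 0)  # assumed time of the previous load

picked = [
    row["AccountID"]
    for row in rows_for_incremental_load(source_rows, last_load_time)
]
print(picked)
```

The comparison here is strictly greater-than, so a row updated at exactly the last load time would be skipped. Whether the boundary should be inclusive is a design decision the test plan should state explicitly.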
Challenges
Voluminous data from heterogeneous sources
Data quality is not assured at the source
Business knowledge: organisation-wide knowledge of enterprise data may not be feasible
Very high cost of quality, because any defect that slips through translates into a high cost for the organisation
Heterogeneous data sources are updated asynchronously
“If you torture data sufficiently, it will confess to almost anything.”
Thank You..!