
Harsh_Ace_ETL Testing

Page 1: Harsh_Ace_ETL Testing

© 2013 Wells Fargo Bank, N.A. All rights reserved. Internal use only.

ETL Testing

Harsh Kumar

Senior Quality Analyst

Bangalore

26/02/2015

Page 2: Harsh_Ace_ETL Testing


Topics Covered

Purpose of ETL Testing

ETL Testing Categories

ETL Testing Process

ETL Testing Techniques

ETL Table Partition

ETL Extraction Method

ETL Transformation Example

ETL Load Method

ETL Tool Demo

Page 3: Harsh_Ace_ETL Testing


Purpose of ETL Testing

Enables analysis of data for critical business decisions.

Provides a platform to move data from various sources into the data warehouse.

Company data may be scattered across different locations and in different formats; ETL helps bring it into one consistent system.

Removes mistakes and corrects data.

Page 4: Harsh_Ace_ETL Testing


ETL Testing Categories

ETL (data warehouse) testing is categorized into four different engagements.

New Data Warehouse Testing – A new DW is built and verified from scratch. Data input is taken from customer requirements and different data sources, and the new data warehouse is built and verified with the help of ETL tools.

Migration Testing – The customer has an existing DW and ETL performing the job, but wants to migrate to a new tool to improve efficiency.

Change Request – New data is added from different sources to an existing DW. There may also be cases where the customer needs to change an existing business rule or integrate a new one.

Report Testing – Reports are the end result of any data warehouse and the basic purpose for which the DW is built. Reports must be tested by validating layout, report data, and calculations.

Page 5: Harsh_Ace_ETL Testing


ETL Testing Process

Verify that data transformation from source to destination works as expected.

Verify that the expected data is added to the target system.

Verify that all DB fields and field data are loaded without truncation.

Verify data checksums for record count matches.

Verify that proper error logs with full details are generated for rejected data.

Verify NULL-value fields.

Verify that duplicate data is not loaded.

Verify data integrity.

A few of these checks are sketched in SQL below.
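As a rough illustration, the count, duplicate, and NULL checks above might look like the following Oracle-style SQL; the tables and columns (src_orders, dw_orders, order_id, customer_key) are hypothetical, not from the deck.

    -- Record count match between source and target:
    SELECT (SELECT COUNT(*) FROM src_orders) AS src_count,
           (SELECT COUNT(*) FROM dw_orders)  AS tgt_count
    FROM   dual;

    -- Duplicate check on the target's business key:
    SELECT   order_id, COUNT(*) AS cnt
    FROM     dw_orders
    GROUP BY order_id
    HAVING   COUNT(*) > 1;

    -- NULL check on a mandatory field:
    SELECT COUNT(*) AS null_keys
    FROM   dw_orders
    WHERE  customer_key IS NULL;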

Page 6: Harsh_Ace_ETL Testing


ETL Process

The most time-consuming process in DW development

80% of development time is spent on ETL!

Extract relevant data

Transform data to DW format

Build keys, etc.

Cleansing of data

Load data into DW

Build aggregates, etc.

Page 7: Harsh_Ace_ETL Testing


ETL Testing Techniques:

Verify that data is transformed correctly according to the various business requirements and rules.

Make sure that all projected data is loaded into the data warehouse without any data loss or truncation.

Make sure that the ETL application appropriately rejects invalid data, replaces it with default values, and reports it.

Make sure that data is loaded into the data warehouse within the prescribed and expected time frames, to confirm performance and scalability.

A transformation-rule check is sketched below.
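One common way to test transformation correctness is a source-minus-target comparison. A minimal sketch, assuming a hypothetical rule that trims and upper-cases customer names (src_customer and dw_customer are illustrative names):

    -- Any rows returned indicate a transformation or load defect.
    SELECT customer_id, UPPER(TRIM(customer_name)) AS customer_name
    FROM   src_customer
    MINUS
    SELECT customer_id, customer_name
    FROM   dw_customer;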

Page 8: Harsh_Ace_ETL Testing


ETL Table Partition

A process that lets you decompose large tables into smaller, more manageable pieces called partitions.

Range partition (e.g., a date column)

List partition (e.g., a sales-region column)

Hash partition (e.g., an ID column, spread across tablespaces)

Composite partition (range + hash, range + list)

Oracle-style examples are sketched below.
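Hedged Oracle-style DDL sketches of these partitioning schemes; the sales_fact table and its columns are hypothetical.

    -- Range partition on a date column:
    CREATE TABLE sales_fact (
        sale_id   NUMBER,
        sale_date DATE,
        region    VARCHAR2(20)
    )
    PARTITION BY RANGE (sale_date) (
        PARTITION p2013 VALUES LESS THAN (DATE '2014-01-01'),
        PARTITION p2014 VALUES LESS THAN (DATE '2015-01-01'),
        PARTITION pmax  VALUES LESS THAN (MAXVALUE)
    );

    -- List partition on a sales-region column:
    --   PARTITION BY LIST (region)
    --     (PARTITION p_east VALUES ('NY', 'NJ'),
    --      PARTITION p_west VALUES ('CA', 'WA'))

    -- Hash partition on an ID column, spread across tablespaces:
    --   PARTITION BY HASH (sale_id) PARTITIONS 4 STORE IN (ts1, ts2, ts3, ts4)

    -- Composite partition (range + hash):
    --   PARTITION BY RANGE (sale_date)
    --     SUBPARTITION BY HASH (sale_id) SUBPARTITIONS 4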


Page 9: Harsh_Ace_ETL Testing


ETL Extraction Method

The process of reading data from the source database.

There are four common methods:

Scripts in Linux shell, Perl, or Python

SQL*Loader (sqlldr) plus SQL

Hand-coded Java, C#, or JDBC programs

An in-house ETL tool

A script-style extraction is sketched below.
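As a minimal sketch of the scripted approach, a SQL*Plus spool can dump a query result to a flat file for downstream loading; the table, columns, and output path here are hypothetical.

    SET PAGESIZE 0
    SET FEEDBACK OFF
    SET COLSEP '|'
    SPOOL /tmp/customer_extract.dat

    -- Extract the last week's rows from a hypothetical source table:
    SELECT customer_key, customer_name, state
    FROM   src_customer
    WHERE  load_date >= TRUNC(SYSDATE) - 7;

    SPOOL OFF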

Page 10: Harsh_Ace_ETL Testing


ETL Transformation Example

Field splitting
Source: Address field "#123 ABC Street|XYZ City|1000 Republic of MN"
DW: No: 123; Street: ABC; City: XYZ; Country: Republic of MN; Postal code: 1000

Field consolidation
Source: System A customer title "President"; System B customer title "CEO"
DW: Customer title "President & CEO"

Standardization
Source: Order date "05 August 1998"; Order date "08/08/98"
DW: Order date "05 August 1998"; Order date "08 August 1998"

Deduplication
Source: System Debit Card and System Credit Card each hold Customer name "Ramesh Kumar", Org "Wells Fargo", Address "Bengaluru"
DW: One bank account record – Customer name "Ramesh Kumar", Org "Wells Fargo", Address "Bengaluru"
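Hedged Oracle SQL sketches of three of these transformations; all table and column names (src_address, src_orders, stg_customer, source_system) are illustrative, not from the deck.

    -- Field splitting: pull the first '|'-delimited token out of an address.
    SELECT SUBSTR(address, 1, INSTR(address, '|') - 1) AS street_part
    FROM   src_address;

    -- Standardization: normalize a two-digit-year date to "DD Month YYYY".
    SELECT TO_CHAR(TO_DATE(order_dt_txt, 'MM/DD/RR'),
                   'DD FMMonth YYYY') AS order_date
    FROM   src_orders;

    -- Deduplication: keep one row per customer across source systems.
    SELECT customer_name, org, address
    FROM  (SELECT s.*,
                  ROW_NUMBER() OVER (PARTITION BY customer_name, org, address
                                     ORDER BY source_system) AS rn
           FROM   stg_customer s)
    WHERE  rn = 1;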

Page 11: Harsh_Ace_ETL Testing


ETL Load Method

The process of writing data into the target database.

Slowly Changing Dimension load (checksum/incremental)

Fact table load (temp/truncate)

Snapshot table load (reporting, temp, truncate fact)

Current-state table load (insert/update)

Two of these load patterns are sketched below.
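A minimal sketch of two of these patterns in Oracle-style SQL, assuming hypothetical staging and target tables (stg_sales, dw_sales_fact, stg_account, dw_account_current):

    -- Fact table load, truncate-and-reload pattern:
    TRUNCATE TABLE dw_sales_fact;

    INSERT INTO dw_sales_fact (sale_id, sale_date, customer_key, amount)
    SELECT sale_id, sale_date, customer_key, amount
    FROM   stg_sales;

    -- Current-state table load, insert/update in a single MERGE:
    MERGE INTO dw_account_current t
    USING stg_account s
    ON (t.account_id = s.account_id)
    WHEN MATCHED THEN
        UPDATE SET t.balance = s.balance
    WHEN NOT MATCHED THEN
        INSERT (account_id, balance)
        VALUES (s.account_id, s.balance);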

Page 12: Harsh_Ace_ETL Testing


Star Schema (Denormalized)

Page 13: Harsh_Ace_ETL Testing


Snowflake Schema (Normalized)
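As a rough sketch of the star/snowflake difference the two diagram slides illustrate, the same product dimension could be modeled both ways; the table and column names below are illustrative only.

    -- Star schema: one denormalized dimension table.
    CREATE TABLE dim_product_star (
        product_key   NUMBER PRIMARY KEY,
        product_name  VARCHAR2(50),
        category_name VARCHAR2(50)   -- category held inline (denormalized)
    );

    -- Snowflake schema: the same dimension normalized into two tables.
    CREATE TABLE dim_category (
        category_key  NUMBER PRIMARY KEY,
        category_name VARCHAR2(50)
    );

    CREATE TABLE dim_product_snow (
        product_key   NUMBER PRIMARY KEY,
        product_name  VARCHAR2(50),
        category_key  NUMBER REFERENCES dim_category (category_key)
    );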

Page 14: Harsh_Ace_ETL Testing


SCD 1 & 2

SCD 1 – the changed attribute is simply overwritten in place.

Before:

Customer Key | Name     | State
1001         | Williams | New York

After:

Customer Key | Name     | State
1001         | Williams | Los Angeles

SCD 2 – each change inserts a new row with a new surrogate key and effective dates.

Customer Key | Name     | State       | Start Date  | End Date
1001         | Williams | New York    | 16-Mar-2009 | 19-Feb-2010
1005         | Williams | Los Angeles | 20-Feb-2010 | 29-Jul-2012
1009         | Williams | California  | 30-Jul-2012 | 31-Dec-2999
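The SCD 2 table above can be maintained with an expire-then-insert pattern; a minimal sketch, assuming a hypothetical dim_customer table:

    -- Close out the current version of the row:
    UPDATE dim_customer
    SET    end_date = DATE '2010-02-19'
    WHERE  customer_key = 1001
    AND    end_date = DATE '2999-12-31';

    -- Insert the new version with a fresh surrogate key:
    INSERT INTO dim_customer (customer_key, name, state, start_date, end_date)
    VALUES (1005, 'Williams', 'Los Angeles',
            DATE '2010-02-20', DATE '2999-12-31');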

Page 15: Harsh_Ace_ETL Testing


SCD-3

SCD 3 – a limited history is kept in extra columns (original vs. current value).

Before:

Customer Key | Name     | State
1001         | Williams | New York

After the first change:

Customer Key | Name     | Original State | Current State | Effective Date
1001         | Williams | New York       | Los Angeles   | 20-Feb-2010

After the second change:

Customer Key | Name     | Original State | Current State | Effective Date
1001         | Williams | New York       | California    | 30-Jul-2012
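SCD 3 needs only an in-place update, since the history lives in the extra column; a minimal sketch against a hypothetical dim_customer_scd3 table:

    -- Overwrite the current value; the original value stays untouched.
    UPDATE dim_customer_scd3
    SET    current_state  = 'California',
           effective_date = DATE '2012-07-30'
    WHERE  customer_key   = 1001;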

Page 16: Harsh_Ace_ETL Testing


ETL Tool OFSAA

OFSAA works as an integrator: it extracts data from different sources, transforms it into the preferred format based on the business transformation rules, and loads it into the data warehouse.

http://wsvra00a0152.wellsfargo.com:9704/analytics

https://wspra00a0555.wellsfargo.com:16351/profitview/prelogin.jsp

[Architecture diagram: "ProfitView OFSAA Solution Architecture (Release 4)". Recoverable elements:

Data sourcing – SOR transactional data via weekly Informatica ETL feeds (iHUB, TRIP, REALM, SIMCORP, T24, Int. Calypso, EASTDIL); master and reference data (iHUB, CPL, ORBT, ICIS); dimensions and hierarchy (Product – CPL, Organization – ORBT); CURE FX rates; DRM GL account and FP&A hierarchies; manual Excel feeds (other reference data elements, internal breakage charge, managed term premium, security file upload, adjustments); CDS landing area (MCV, Officer; all information available in CDS without any SOR-specific filter); landing area with history.

Common staging area (history) – transaction staging tables; dimension/reference/security staging tables; data quality checks (technical and business); truncate & reload; slowly changing dimensions process with Type 2 rules; contract-to-dimension mapping (iHub Contract ID to OFSAA Acct Skey and its components; Acct Skey to Product, Officer, Customer, AU).

Results area – FACT tables (5 years of history); adjustments/restatements with adjustment tables and DQ checks; ProfitView-specific calculations (apply adjustments, financial attribution computation, aggregation prep; EPM COF base rate, COF term floating, FTP rate); FACT data for 3 years for aggregation and drilldown; 3-60 months data archive; data provisioning tables; adjustments to ProfitMax (PMAX-specific financial adjustments and dimensional reassignments sent via flat file).

Front end – OFSAA user front end; batch creation and run framework; aggregation/provisioning into Essbase cubes (Product, Customer, Officer, Organization); RPD; dashboards for profitability and cross-sell over atomic and reporting schemas (CDS: ICON, PMAX, OSDP) for end users/analysts.

Operations – AUTOSYS batch scheduling and monitoring integrated with OFSAA batches; security maintenance utility assigning roles across OFSAA, Essbase, and OBIEE; adjustments utility with adjustment entry screen, Excel upload, and error reports back to users; financial attribution agreement setup; data quality and adjustment error reports; lookups against reference and dimension tables.

Legend – Release 5 scope items vs. Release 6+ scope items.]

Page 17: Harsh_Ace_ETL Testing


Thank You

