
Oracle Data Integrator in Swedbank EDW - Rein Adamson and Mart Tudre

Date post: 04-Dec-2014
Upload: oracle-user-group-estonia
Description:
Event: Oracle Technology Day 2011 Date: 20.10.2011 Place: Nordic Hotel Forum Country: ESTONIA
Transcript
Page 1: Oracle Data Integrator in Swedbank EDW - Rein Adamson and Mart Tudre

© Swedbank

Mart Tudre – Swedbank Baltic DW architect

Rein Adamson – Project Manager

Oracle Data Integrator ETL software in Swedbank EDW 2007 – 2011

Page 2

Agenda

• EDW - Enterprise Data Warehouse
– EDW, BI definitions
– Swedbank Baltic DW - general facts

• ETL software evaluation 2007
– ETL Software evaluation and Proof of Concept 2007
– ODI Implementation project
– User roles today

• ODI implementation in Swedbank Baltic DW
– ODI defining features
– Usage specifics and custom components

Page 3

Data Warehouse – a definition

• A data warehouse is a repository of an organization's electronically stored data, designed to facilitate reporting and analysis.

• An expanded definition of data warehousing includes tools for
– business intelligence
– extracting, transforming and loading data into the repository
– managing and retrieving metadata.

Business intelligence: computer-based techniques used in spotting, digging out, and analyzing business data.

Source: wikipedia.org

ETL – Extract, Transform, Load

EDW – Enterprise Data Warehouse (also the name of an IT organizational unit in Swedbank)

Page 4

Business Intelligence functions

• predictive analytics (statistics, data mining)

• online analytical processing (OLAP)

• business performance management

• benchmarking

• text mining

• reporting

Page 5

Data Warehouse architecture

[Diagram labels: Operational Data; Data Transformation; Enterprise Data Warehouse, Integrated Data Marts; Replication; Source; Business Users; Analytical Users]

Page 6

[Diagram: data flows from source systems through data acquisition into the central data store, and from there through data delivery to analytical services]

Source systems: CARDS, DEPOSIT, LEASING, LOAN, GL, ...

Analytical services: CM, FM, RM, CB

Data flows: Data acquisition, Data delivery

Central data store, subject areas:

• FINANCE: The internal accounting of the business.
• PARTY ASSET: Things parties have an interest in that have value.
• PRODUCT: Any marketable product or service including terms, conditions and features.
• INTERNAL ORGANIZATION: A Party that is a unit of business.
• LOCATION: A physical address, electronic address or geographical area.
• CAMPAIGN: A communication plan to deliver a message.
• CHANNEL: The vehicle by which a party may interact with the financial institution.
• EVENT: Something of interest that happened that may or may not involve contact with the customer.
• AGREEMENT: A contract or any type of agreement of interest between Parties.
• PARTY: An individual, business or group of individuals of interest to the financial institution.

Page 7

Swedbank Baltic DW

Swedbank Baltic Data Warehouse (EDW) is a subject-oriented, integrated, time-variant, non-volatile collection of enterprise data.

– Subject Oriented: Information is organized by subject area instead of by business-line-specific source system data structures. Subject areas are Party, Product, Agreement, Channel, Organization, Event etc.

– Integrated: Data is gathered into the data warehouse from a variety of sources and merged into a coherent whole under unified governance by using agreed dimensions, such as Party, Product, Agreement, Channel, Organisation etc.

– Time-variant: All data in the data warehouse is identified with a particular time period. DW stores history.

– Non-volatile: Data in the data warehouse is usually not over-written or deleted. Once committed, the data is read-only, and retained for future reporting and analysis.

– Detailed: The granularity is detailed business events.

– Based on a reference industry model: the Teradata financial services logical data model.
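The time-variant and non-volatile properties can be illustrated with a small sketch. It assumes a validity-interval layout (a start date and an optional end date per version); the names `Version`, `apply_change` and `value_as_of` are hypothetical and are not taken from the Swedbank implementation.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Version:
    """One historical version of a record, valid for [start_date, end_date)."""
    key: int
    value: str
    start_date: date
    end_date: Optional[date] = None  # None means "still current"


def apply_change(history: List[Version], key: int, value: str, as_of: date) -> List[Version]:
    """Return a new history in which the current version of `key` is closed
    at `as_of` and a new current version is appended. No version is deleted."""
    result = []
    for v in history:
        if v.key == key and v.end_date is None:
            if v.value == value:
                return list(history)  # nothing changed, history stays as it is
            # Close the old version instead of overwriting its value.
            v = Version(v.key, v.value, v.start_date, as_of)
        result.append(v)
    result.append(Version(key, value, as_of))
    return result


def value_as_of(history: List[Version], key: int, day: date) -> Optional[str]:
    """Look up the value that was valid for `key` on `day`."""
    for v in history:
        if v.key == key and v.start_date <= day and (v.end_date is None or day < v.end_date):
            return v.value
    return None
```

With this layout a report can ask for the state of a record on any past date, which is the practical meaning of "DW stores history".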

Page 8

Multiple usage of data warehouse

• Different business services have different requirements for
– data availability frequency and timing (e.g. daily 6 am, daily 6 pm, monthly on the 1st day at 8 am)
– data quality (some services have near 0 tolerance to errors)
– performance and workload

Page 9

Enterprise model (High level)

• FINANCE: The internal accounting of the business.
• ASSET: Items that belong to parties and which have value.
• PARTY: An individual or group of individuals.
• LOCATION: A geographic or spatial area, physical address or electronic address.
• EVENT: Financial or non-financial event which may involve contact with the customer.
• INTERNAL ORGANIZATION: A unit of business within the financial institution or insurance company. Is a type of Party.
• AGREEMENT: A contract or deal between parties that is of interest.
• PRODUCT: Any marketable or tradable product or service including terms and conditions.
• CAMPAIGN: A communication plan directed at parties or a market for a purpose.
• CHANNEL: The vehicle by which a customer interacts with the financial institution/insurance company.

Not all relationships are shown.

Page 10

Swedbank Baltic DW Statistics

External
• 30 source systems (containing 1000 source objects)
• 50 business services
• 75 employees in Baltic DW

Internal
• 20 Terabytes of storage (50 TB planned for 2012)
• 650 objects in main data store
• 500 ETL processes
• 4000 database objects
• 40 database schemas in DW
• 245 direct DB users, 500 reporting users

Page 11

How to manage

• everyday operations
• development
• testing
• releasing
• migration (both technical and business)
• ETL workflow optimisation

Answer: an enterprise metadata system is needed

Page 12

ETL is part of METADATA

Enterprise Metadata

Page 13

ETL Software evaluation and POC 2007
Rein Adamson – project manager

• Request for Proposal to 4 Vendors
• 2 Vendors selected for Proof of Concept (POC)
– Oracle “ODI”
– Informatica “PowerCenter” (ETL market leader)
• POC budget 20 kEUR
• Evaluation process duration 5-8 months:
– 2 months: RFP and selection of 2 Vendors for POC
– 4 months: POC preparation
– 1 month: POC action + results to management decision
– 1 month: License and Implementation Contract with the winner

Page 14

POC - Proof Of Concept 2007

• POC budget 10 kEUR per Vendor included:
– 1 day system installation on bank IT infrastructure
– 2 days preparation before arrival (5 tasks sent)
– 5 days onsite consultant

• POC scope in 5 days with consultant:
– Day 1: Training for the POC team (5 persons)
– Days 2, 3, 4: guidance to the team for development of 5 ETL tasks
– Last day: 2 hrs demo to IT managers

Page 15

POC Loading tasks scenarios

TASKS CONTENT:
• Task 1 – Agreement loading (incl. historisation)
• Task 2 – Trigger filled to history table (incl. country context)
• Task 3 – Rows to Columns and vice versa
• Task 4 – Aggregation within Teradata
• Task 5 – Bank transactions (events) loading
– from 3 sources into 1 target, capacity performance test with 7 million rows

• 3 days to complete 5 ETL tasks
• 1 task for each POC team member. Experienced DWH specialists: developer, analyst, DBA, Admin, 2 architects
• Consultant was a trainer to support our specialists

Page 16

KSF - Key Success Factors evaluated

• Reusability and standardization of loadings (high)

• Impact analysis on attribute level

• Resources for EDW services performance

• Release deployment and configuration

• Functionality of metadata repository (medium priority)

• Improve EDW development process

• EDW loading and calculation workflow management

• Faster analysis stage of development task

• Faster process and error maintenance (low priority)

Page 17

Reusability and standardization of loading patterns
Flexibility of loading templates. Customizable, but robust. The target is to shorten development time by reusing existing patterns.

ODI
• All the objects in ODI are reusable because of the substitution method used.
• ELT architecture supports today's skill sets.
• Business and technical information has been separated from data load logic.
• 2,8 points out of 3

INFORMATICA
• Templates are fixed source/target templates. Technical options are integrated with business logic.
• It is possible to create reusable components, but while doing the tasks it became clear that at some point it is easier to start from a blank page.
• 1,8 points out of 3

Page 18

Release deployment and configuration
Time and understanding needed for maintenance and deployment of new loading procedures. Easier and faster release management.

ODI
• Topology is transparent and easily understandable,
• Monitoring is at the necessary detail level, together with debugging,
• No additional environments needed, information is moving between repositories only,
• Versioning with install/rollback functionality is available.
• ...
• 2,6 points out of 3

IFA (Informatica)
• Topology is not clear and transparent,
• Release complexity can grow to the point where it is comparable to today's situation,
• Monitoring and debugging are available only at a high level until steps have been completed; there is no intermediate access,
• Country-based approach is not supported in the central repository.
• 1,8 points out of 3

Page 19

POC results summary comment

– ODI utilizes the existing infrastructure. There is no new proprietary transformation server or database. The tool uses the source and target database engines and their tools to unload, load and transform the data. It is transparent. There is no need for substantially new skills or additional specialists.

– Informatica brings in totally new technology: additional specialists are needed, and more training and consultancy has to be bought.

Page 20

KSF evaluation points (max 3)

[Bar chart: ODI and IFA scores for each KSF, axis from 1,0 to 3,0]

1. Reusability and standardisation of loadings
2. Impact analysis on attribute level
3. Resources for EDW services performance
4. Release deployment and configuration
5. Functionality of metadata repository
6. Improve EDW development process
7. EDW loading and calculation workflow management
8. Faster analysis stage of development task
9. Faster process and data error maintenance

Page 21

ODI implementation Sept 2007 - Sept 2008

• Oracle ODI partner consultancy used
– 1 standard training of 4 days, 10 persons in class
– 1 onsite visit of 2 days (consultant from Italy)
– 5 days of off-site consultancy during 3 months (Poland)
– 5 Oracle support cases

• Customer resource
– 1 experienced ETL developer assigned 100% for 1 year

• Custom solutions design and implementation:
– ETL Process registry design and development (2 months duration)
– Common Wrapper development (3 months)
– Process Registry and Common Wrapper testing, debugging (2 months)
– ODI release process procedures implementation (2 months)

Page 22

83 active ODI Users today

• 59 users in EDW (71%), 22 users in CRM area (27%)

• 35 Analyst-Developers; 16 SQA-s. Dev+SQA=61%

[Bar chart: number of users (axis 0 to 40) per role: Developer, SQA, Service Manager, Implementator, other manager, App.admin, Sys.admin-DBA; series: CRM, EDW, LOANS]

Page 23

Oracle Data Integrator

Page 24

Oracle Data Integrator

• Oracle Data Integrator is a comprehensive data integration platform that covers all data integration requirements, from high-volume, high-performance batch loads, to event-driven, trickle-feed integration processes, to SOA-enabled data services.

ODI is Oracle’s Strategic Product for Data Integration
• Heterogeneous E-LT Architecture
• Optimized Connectivity Architecture
• Modular Implementation Architecture
• SOA-Native Architecture

Page 25

ODI Component Architecture

Page 26

Repository Set-Up Pattern

Development – Test – Production Cycle

• Master Repository: Security, Topology, Versioning
• Work Repository (Development): Models, Projects, Execution. Create and archive versions of models, projects and scenarios.
• Work Repository (Test & QA): Models, Projects, Execution. Import released versions of models, projects and scenarios for testing.
• Execution Repository (Production): Execution. Import released and tested versions of scenarios for production.

Page 27

E+LT approach

Page 28

CL_PARTY
Party_Id: INTEGER; Individual_Organization_Code: SMALLINT; Lifecycle_Code: SMALLINT; Primary_Host_Customer_Nbr: VARCHAR(20); Primary_Host_Id: SMALLINT; Full_Name: VARCHAR(240); Short_Name: VARCHAR(70); First_Name: VARCHAR(70); Middle_Name: VARCHAR(70); Last_Name: VARCHAR(70); Customer_Residency_Code: SMALLINT; Identification_Nbr: VARCHAR(20); Party_Start_Date: DATE; Residency_Country_Geog_Area_Id: INTEGER; Birth_Date: DATE; Legal_Registration_Date: DATE; Customer_Type_Code: SMALLINT; Address_Use_Code: SMALLINT; Address_Line: VARCHAR(140); City_Name: VARCHAR(30); Postal_Code: VARCHAR(20); Phone_Nbr_1: VARCHAR(20); Phone_Nbr_2: VARCHAR(20); Electronic_Address: VARCHAR(50); Manager_Party_Id: INTEGER; Fax_Nbr: VARCHAR(20); City_Geog_Area_Id: INTEGER; State_Geog_Area_Id: INTEGER; Segment_Id: INTEGER; Affiliation_Segment_Id: INTEGER; Affiliation_Party_Id: INTEGER; Homebranch_Channel_Id: INTEGER; SIC_Code: VARCHAR(10); SIC_Group_Code: SMALLINT; Legal_Structure_Code: SMALLINT; Employees_Cnt: INTEGER; System_Abuse_Type_Code: SMALLINT; Language_Demog_Value_Id: INTEGER; Education_Demog_Value_Id: INTEGER; Social_Status_Demog_Value_Id: INTEGER; Marital_Status_Demog_Value_Id: INTEGER; Dependants_Cnt: INTEGER; Parent_Internal_Org_Party_Id: INTEGER; Party_Change_Dtime: TIMESTAMP(0); Party_Change_Load_Dtime: TIMESTAMP(0); Birth_Country_Geog_Area_Id: INTEGER; Gender_Code: CHAR(1); Party_Status: SMALLINT

CL_CONTRACT
Account_Nbr: VARCHAR(35); Account_Nbr_Modifier: SMALLINT; Account_Type_Code: SMALLINT; Product_Id: INTEGER; Account_Currency_Code: CHAR(3); Account_Product_Type_Code: SMALLINT; Acct_Status_Type_Code: SMALLINT; Account_Registration_Date: DATE; Account_Sign_Date: DATE; Account_Open_Date: DATE; Account_Maturity_Date: DATE; Account_Closing_Date: DATE; Account_Name: VARCHAR(100); Owner_Party_Id: INTEGER; Quotation_Id: INTEGER; Portfolio_Channel_Id: INTEGER; Affiliation_Party_Id: INTEGER; Manager_Party_Id: INTEGER; Application_Open_Date: DATE; Open_Channel_Id: INTEGER; Open_Party_Id: INTEGER; Open_User_Code: VARCHAR(16); Hinter_Party_Id: INTEGER; Seller_Party_ID: INTEGER; Group_Account_Child_Ind: CHAR(1); Contract_Status_Type_Code: SMALLINT; Current_Account_Nbr: VARCHAR(35); Current_Account_Nbr_Modifier: SMALLINT; Product_Param1_Code: INTEGER; Product_Param2_Code: INTEGER; Product_Param3_Code: INTEGER; Account_Change_Dtime: TIMESTAMP(0); Account_Change_Load_Dtime: TIMESTAMP(0); MIS_Product_Id: INTEGER; Interest_Rate_Pct: DECIMAL(8,3); Base_Rate_Pct: DECIMAL(8,3); Interest_Index_Code: SMALLINT

CL_BANK_ACCOUNT
Account_Nbr: VARCHAR(35); Account_Nbr_Modifier: SMALLINT; Account_Currency_Code: CHAR(3); Account_Product_Type_Code: SMALLINT; Acct_Status_Type_Code: SMALLINT; Account_Registration_Date: DATE; Account_Open_Date: DATE; Account_Maturity_Date: DATE; Account_Closing_Date: DATE; Owner_Party_Id: INTEGER; Manager_Party_Id: INTEGER; Open_Party_Id: INTEGER; Open_Channel_Id: INTEGER; Open_User_Code: VARCHAR(16); Account_Change_Dtime: TIMESTAMP(0); Account_Change_Load_Dtime: TIMESTAMP(0); Last_Renewal_Date: DATE; Term_Period_Code: SMALLINT; Term_Period_Value: INTEGER; Deposit_Interest_Rate: DECIMAL(8,3); Actual_Interest_Rate: DECIMAL(8,3); Deposit_Interest_Amt: DECIMAL(18,2); Deposit_Account_Amt: DECIMAL(18,2); Actual_Deposit_Amt: DECIMAL(18,2); Auto_Prolong_Ind: SMALLINT; Auto_Prolong_Period_Code: SMALLINT; Auto_Prolong_Period_Value: SMALLINT; Auto_Prolong_End_Date: DATE; Premature_Termination_Ind: SMALLINT; Premature_Termination_Rate_Ind: SMALLINT; Interest_Calc_Method_Code: SMALLINT; Interest_Account_Nbr: VARCHAR(35); Interest_Account_Nbr_Modifier: SMALLINT; Fund_Rate_Pct: DECIMAL(16,9); Affiliation_Party_Id: INTEGER; Group_Account_Child_ind: CHAR(1); Contract_Status_Type_Code: SMALLINT; Data_Validation_Result_Code: SMALLINT; Product_id: INTEGER; Portfolio_Cahnnel_Id: INTEGER; Mis_Product_Id: INTEGER; Portfolio_Channel_Id: INTEGER; Deposit_Renewed_Ind: CHAR(1); Additional_Interest_Rate: DECIMAL(8,3); Interest_Disbm_Type_Code: SMALLINT; Deposit_Termination_Rate: DECIMAL(8,3); Currency_Conv_Ind: CHAR(1); Investment_Product_id: SMALLINT

CUSTOMER: CUSTOMER NUMBER, CUSTOMER NAME, CUSTOMER CITY, CUSTOMER POST, CUSTOMER ST, CUSTOMER ADDR, CUSTOMER PHONE, CUSTOMER FAX
ORDER: ORDER NUMBER, ORDER DATE, STATUS
ORDER ITEM BACKORDERED: QUANTITY
ITEM: ITEM NUMBER, QUANTITY, DESCRIPTION
ORDER ITEM SHIPPED: QUANTITY, SHIP DATE
Relationship labels: R/370, R/372, R/374, R/378, R/379

HOST_PARTY: Host_ID, Identification_Nbr
HOST_PARTY_IDENTIFICATION_HISTORY: Host_ID (FK), Identification_Nbr (FK), Start_Date, Master_Party_ID (FK), End_Date
MASTER_PARTY: Master_Party_ID
HOST_PARTY_ACCOUNT: Host_ID (FK), Identification_Nbr (FK), Account_Nbr (FK), Account_Nbr_Modifier (FK), Start_Date, End_Date
HOST_PARTY_RELATION: Host_ID (FK), Identification_Nbr (FK), Related_Identification_Nbr (FK), Related_Host_Id (FK), Start_Date, End_Date

Page 29

ODI Topology usage example

PHYSICAL SERVERS - PRODUCTION
• ODI Server Name: PROD_CORE_EE, Server Name: TALLINN (LDAP), Schema: CARD
• ODI Server Name: PROD_CORE_LV, Server Name: RIGA (LDAP), Schema: CARD
• ODI Server Name: PROD_DW_GR, Server Name: EDW.DOMAIN.EE (IP), Schema: MAIN

LOGICAL SCHEMAS
• Logical Schema: CORE_CARD, with CONTEXT: PROD_EE and CONTEXT: PROD_LV
• Logical Schema: DW_MAIN, with CONTEXT: PROD_EE, CONTEXT: PROD_LV and CONTEXT: PROD_GR

• Logical schema is mapped through Context to Physical Server and Physical Schema
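The rule that a logical schema is resolved through a context to one physical server and schema can be sketched as a simple lookup table. The server, schema and context names come from the slide; which context points at which server is inferred from the names and should be read as an assumption, and `PhysicalSchema`, `TOPOLOGY` and `resolve` are hypothetical names, not ODI API.

```python
from typing import Dict, NamedTuple, Tuple


class PhysicalSchema(NamedTuple):
    odi_server: str  # ODI physical server name
    host: str        # server name as written on the slide
    schema: str      # physical schema


# (logical schema, context) -> physical schema.
# Assumption: the pairings are inferred from the naming (EE, LV, GR),
# the slide text itself does not spell them out.
TOPOLOGY: Dict[Tuple[str, str], PhysicalSchema] = {
    ("CORE_CARD", "PROD_EE"): PhysicalSchema("PROD_CORE_EE", "TALLINN", "CARD"),
    ("CORE_CARD", "PROD_LV"): PhysicalSchema("PROD_CORE_LV", "RIGA", "CARD"),
    ("DW_MAIN", "PROD_EE"): PhysicalSchema("PROD_DW_GR", "EDW.DOMAIN.EE", "MAIN"),
    ("DW_MAIN", "PROD_LV"): PhysicalSchema("PROD_DW_GR", "EDW.DOMAIN.EE", "MAIN"),
    ("DW_MAIN", "PROD_GR"): PhysicalSchema("PROD_DW_GR", "EDW.DOMAIN.EE", "MAIN"),
}


def resolve(logical_schema: str, context: str) -> PhysicalSchema:
    """Resolve a logical schema in a given context to exactly one physical schema."""
    try:
        return TOPOLOGY[(logical_schema, context)]
    except KeyError:
        raise LookupError(
            f"{logical_schema} is not mapped in context {context}"
        ) from None
```

The same load logic written against `CORE_CARD` can then run for Estonia or Latvia just by switching the context, for example `resolve("CORE_CARD", "PROD_LV")`.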

Page 30

Features of ODI topology

• Physical server has a fixed user name and password
• One logical schema can map to exactly one physical schema in one context
– To use multiple users in the same database, define more contexts or duplicate the data model
• Logical schema cannot change technology

Conclusion: a database schema needs to be defined as many times as there are database users.

A single shared database connection is preferred to maximize ELT –> compromise on resource management by user names on the database side

Page 31

ODI developer basic steps

1. Reverse engineer data models from source and target

2. Define column level data mappings, specify join and filter conditions. Every data mapping (ODI interface) can have exactly one target and multiple sources

3. Select knowledge module (code generator)

4. Generate code (ODI scenario) and execute scenario

Page 32

ODI scenario generation and execution

[Diagram: in ODI Designer, Code Generation combines Data Objects, Interfaces, Knowledge modules and a Package into a Scenario; in ODI Agent, Code Execution runs the Scenario with Context (Topology) and Runtime variables, connecting to DB 1 and DB 2 and executing commands]

• When a knowledge module changes: rebuild and deploy all related scenarios
• When database objects change: refresh data structure definitions from the source database, then rebuild and deploy all related scenarios
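The idea of a knowledge module as a code generator can be sketched with a plain text template. This is only an illustration under stated assumptions: real ODI knowledge modules use ODI's own substitution methods, which are not reproduced here; Python's `string.Template` is used instead, and `APPEND_KM`, `generate_code` and the sample mapping are hypothetical.

```python
from string import Template
from typing import Dict

# Hypothetical "knowledge module": a reusable load pattern with placeholders.
APPEND_KM = Template(
    "INSERT INTO $target ($columns)\n"
    "SELECT $expressions\n"
    "FROM $sources\n"
    "WHERE $filter"
)


def generate_code(km: Template, target: str, mappings: Dict[str, str],
                  sources: str, filter_condition: str = "1 = 1") -> str:
    """Combine one mapping definition (exactly one target, column-level
    expressions, source and filter clauses) with a template to produce SQL."""
    return km.substitute(
        target=target,
        columns=", ".join(mappings),               # target column names
        expressions=", ".join(mappings.values()),  # source expressions
        sources=sources,
        filter=filter_condition,
    )


# Sample mapping, invented for illustration only.
sql = generate_code(
    APPEND_KM,
    target="DW_MAIN.CL_PARTY",
    mappings={
        "Party_Id": "src.CUSTOMER_NUMBER",
        "Full_Name": "UPPER(src.CUSTOMER_NAME)",
    },
    sources="CORE_CARD.CUSTOMER src",
    filter_condition="src.CUSTOMER_NAME IS NOT NULL",
)
print(sql)
```

Because the generated text depends on both the template and the mapping, a change to either one requires regenerating the code, which is why related scenarios have to be rebuilt and redeployed.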

Page 33

Custom components to manage 500 ETL processes

• Process registry
– all processes and their dependencies

• Common wrapper
– special scenario wrapping all others

• ODI monitor
– Web access to process registry

• Release builder
– Used for deploying from test to development

Page 34

Process registry

• List of all ETL processes regardless of technology
- Create, change, retire process
- All necessary information for maintaining the list

• Process scheduling information

• Dependencies between processes
– Process to process dependencies
– Dependencies through “Dependency Group”
– Based on process bookmarks

Page 35

Common Wrapper

• A special single-instance ODI scenario through which all other scenarios are executed (pre and post steps)

• Implements common functionality needed for all processes
- Checks if the preliminaries of the process have been fulfilled
- Checks if the process is allowed to run at the moment
- Assigns common process control variables and passes their values to the executed scenario
- Logs execution bookmarks, ODI session ids, run result
- Alerts monitoring in case of failure
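The wrapper pattern described above, common pre and post steps around every scenario, can be sketched as follows. This is a minimal illustration and not the Swedbank implementation: `run_wrapped`, the callback parameters and the result strings are all hypothetical.

```python
import logging
from typing import Callable, Dict, List

log = logging.getLogger("common_wrapper")


def run_wrapped(process_name: str,
                scenario: Callable[[Dict[str, str]], None],
                preliminaries_ready: Callable[[str], bool],
                run_allowed: Callable[[str], bool],
                alert: Callable[[str, Exception], None],
                run_log: List[Dict[str, str]]) -> str:
    """Run one scenario with common pre and post steps and return the result."""
    # Pre steps: check preliminaries and permission to run.
    if not preliminaries_ready(process_name):
        result = "SKIPPED_NOT_READY"
    elif not run_allowed(process_name):
        result = "SKIPPED_NOT_ALLOWED"
    else:
        # Assign common control variables and pass them to the scenario.
        variables = {"PROCESS_NAME": process_name}
        try:
            scenario(variables)
            result = "OK"
        except Exception as exc:
            # Post step on failure: log the error and alert monitoring.
            log.exception("process %s failed", process_name)
            alert(process_name, exc)
            result = "FAILED"
    # Post step in every case: record the run result.
    run_log.append({"process": process_name, "result": result})
    return result
```

Keeping these checks in one place means that individual scenarios contain only load logic, and a change to logging or alerting is made once for all processes.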

Page 36

Custom components overview

Page 37

Dependency group

• A dependency group is defined by the data content that a process delivers. It corresponds to a business concept / subject area + data availability.

• Processes are either:
– Suppliers of a Dependency group
– Consumers of a Dependency group

• Dependency groups are also used to show data availability bookmarks to users in the ad-hoc reporting environment

Page 38

Dependencies between processes

[Diagram with three layers, connected by "Is Supplier for" and "Is Consumer of" arrows]

• Supplying processes: Loan Agrmt loading, Bank Account loading, Factoring Agrmt Loading, Leasing Agrmt Loading
• Dependency Groups: Credit Agrmt Dly, Fin Agrmt Dly, Fin Agrmt Bal Dly
• Consuming processes: Value added calculation 1, Value added calculation 2
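Supplier and consumer relations through dependency groups imply an execution order between processes, and that order can be derived with a topological sort. The sketch below uses Python's standard `graphlib` (Python 3.9+). The process and group names are taken from the diagram, but which process supplies or consumes which group is an assumption made for illustration, and `execution_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Dependency group -> supplying processes (assignment assumed for illustration).
SUPPLIERS: Dict[str, Set[str]] = {
    "Credit Agrmt Dly": {"Loan Agrmt loading", "Leasing Agrmt Loading",
                         "Factoring Agrmt Loading"},
    "Fin Agrmt Dly": {"Loan Agrmt loading", "Bank Account loading"},
}

# Dependency group -> consuming processes (assignment assumed for illustration).
CONSUMERS: Dict[str, Set[str]] = {
    "Credit Agrmt Dly": {"Value added calculation 1"},
    "Fin Agrmt Dly": {"Value added calculation 2"},
}


def execution_order(suppliers: Dict[str, Set[str]],
                    consumers: Dict[str, Set[str]]) -> List[str]:
    """Order processes so that every supplier of a group runs before
    every consumer of that group."""
    sorter: TopologicalSorter = TopologicalSorter()
    for group, supplying in suppliers.items():
        for process in supplying:
            sorter.add(process)  # make sure suppliers appear in the graph
        for consumer in consumers.get(group, ()):
            sorter.add(consumer, *supplying)  # consumer depends on all suppliers
    return list(sorter.static_order())


order = execution_order(SUPPLIERS, CONSUMERS)
```

`TopologicalSorter` raises `graphlib.CycleError` if the registry contains a circular dependency, which is a useful sanity check on the process definitions.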

Page 39

Enterprise metadata context

[Diagram: metadata sources feeding the Enterprise Metadata Repository, which is accessed through a metadata user interface]

• TECHNICAL OPERATIONAL METADATA
– RDBMS metadata
– CASE tool (Logical data models)
– Transformation metadata (ETL tools)
– Presentation metadata (Reporting tools)
• Manual Metadata (Services, business requirements etc)
• Enterprise Metadata Repository
• METADATA USER INTERFACE
– Metadata reports

Page 40

Metadata model – ETL related

Sources of metadata: MANUAL CONFIGURATION, TRANSFORMATION, PROCESS REGISTRY, RDBMS

PROCESS: Process_Name, Service_Component_ShortName (FK), ETL_Server_Name, Process_Status_ShortName, Process_Executable_Name (FK)
PROCESS_EXECUTABLE: Process_Executable_Name, Package_Name (FK)
PACKAGE: Package_Name
PACKAGE_SOURCE_OBJECT: Package_Name (FK), DB_Object_Name (FK), Impact_Layer_Name (FK)
PACKAGE_TARGET_OBJECT: Package_Name (FK), DB_Object_Name (FK), Impact_Layer_Name (FK)
PACKAGE_SOURCE_LAYER: Package_Name (FK), Impact_Layer_Name (FK)
PACKAGE_TARGET_LAYER: Package_Name (FK), Impact_Layer_Name (FK)
PROCESS_PARAM: Process_Name (FK), Param_Name, Param_Value
PROCESS_DEPENDENCY: Process_Name (FK), Dependency_Group_Name (FK)
DEPENDENCY_GROUP: Dependency_Group_Name
DB_OBJECT: Impact_Layer_Name (FK), DB_Object_Name, Service_Component_ShortName (FK)
IMPACT_LAYER: Impact_Layer_Name, Service_Component_ShortName (FK)
PROCESS_RUN: Process_Name (FK), Process_Execution_Dtime, Process_Boomark_Values
PROCESS_SCHEDULE_TIME: Process_Schedule_Type_Code, Process_Schedule_No, Process_Name (FK), Frequency_Type, Frequency_Value
SERVICE: Service_Component_ShortName
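A metadata model like this makes impact analysis a matter of joining tables. The sketch below uses an in-memory SQLite database with three of the tables named in the model (PROCESS, PROCESS_EXECUTABLE, PACKAGE_SOURCE_OBJECT), reduced to the columns needed for the join. The sample rows, the executable and package names, and the `processes_reading` helper are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PROCESS (
    Process_Name TEXT PRIMARY KEY,
    Process_Executable_Name TEXT
);
CREATE TABLE PROCESS_EXECUTABLE (
    Process_Executable_Name TEXT PRIMARY KEY,
    Package_Name TEXT
);
CREATE TABLE PACKAGE_SOURCE_OBJECT (
    Package_Name TEXT,
    DB_Object_Name TEXT,
    Impact_Layer_Name TEXT
);
-- Sample rows are invented for illustration only.
INSERT INTO PROCESS VALUES
    ('Loan Agrmt loading', 'EXE_LOAN'),
    ('Bank Account loading', 'EXE_ACCT');
INSERT INTO PROCESS_EXECUTABLE VALUES
    ('EXE_LOAN', 'PKG_LOAN'),
    ('EXE_ACCT', 'PKG_ACCT');
INSERT INTO PACKAGE_SOURCE_OBJECT VALUES
    ('PKG_LOAN', 'CL_CONTRACT', 'MAIN'),
    ('PKG_ACCT', 'CL_CONTRACT', 'MAIN'),
    ('PKG_ACCT', 'CL_BANK_ACCOUNT', 'MAIN');
""")


def processes_reading(db_object: str) -> list:
    """Return the names of all processes whose package reads `db_object`."""
    rows = conn.execute(
        """
        SELECT p.Process_Name
        FROM PACKAGE_SOURCE_OBJECT s
        JOIN PROCESS_EXECUTABLE e ON e.Package_Name = s.Package_Name
        JOIN PROCESS p ON p.Process_Executable_Name = e.Process_Executable_Name
        WHERE s.DB_Object_Name = ?
        ORDER BY p.Process_Name
        """,
        (db_object,),
    )
    return [name for (name,) in rows]
```

Before changing a table, a developer could call `processes_reading("CL_CONTRACT")` to list the processes that would need retesting.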

Page 41

Process execution preliminaries

Page 42

Process execution Daemon

• Planned component for automatic ETL workflow management. It starts a process when:
– It is time to process new data
– Preliminaries are ready
– Process run is allowed

• Replacement of enterprise job scheduler

• Utilizing the framework of Process Registry and Common Wrapper
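The three start conditions of the planned daemon can be sketched as a single selection step over the process registry. The slide describes the component only at this level, so everything below is an assumption: `RegisteredProcess`, its fields and `processes_to_start` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Iterable, List


@dataclass
class RegisteredProcess:
    name: str
    next_run: datetime        # when new data is due to be processed
    run_allowed: bool = True  # for example switched off during maintenance


def processes_to_start(registry: Iterable[RegisteredProcess],
                       now: datetime,
                       preliminaries_ready: Callable[[str], bool]) -> List[str]:
    """Names of the processes that satisfy all three start conditions."""
    return [
        p.name
        for p in registry
        if p.next_run <= now             # it is time to process new data
        and preliminaries_ready(p.name)  # preliminaries are ready
        and p.run_allowed                # process run is allowed
    ]
```

A daemon loop would call this function periodically and hand each returned process to the common wrapper for execution.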

Page 43

Our experience with ODI (10g)

• Performance concerns
– Educate developers to use existing patterns
– Optimize knowledge modules, while keeping them as generic as possible
– Made a lightweight, quick web application for accessing execution logs

• Functionality
– Modified almost every KM which is now in use
– Created new KMs for common needs (new history integration, SAX XML parsing for loadings, streamed XML output etc.)
– Made workarounds for missing features: OLAP function support, sub queries
– Utilized the ODI code substitution framework to the maximum
– Made a command line utility to start an ODI session on a remote Agent
– Use DTS Agent for scheduling – single high-level workflow management system

Page 44

Our experience with ODI (10g), continued

• Deployment
– We use a single ODI project per Area – shared sets of KMs and Variables
– To test – install separately changed data models, knowledge modules and ODI folders (common releasable unit, based on custom export script)
– Huge ODI project import operation required a custom solution to do an incremental restore for the whole project

• ETL Administrator concerns
– no way to change the code directly in production (in case of urgent issues)

Page 45

Migration of ETL processes from MS-DTS to ODI started in 2008. Current status:

Number of ETL processes: ODI 261, DTS 257

[Bar chart, axis from 0 to 300]

Number of tasks in ETL processes: 3819 and 3224 (chart legend: DTS, ODI)

Page 46

Questions?