Date post: | 04-Dec-2014 |
Category: |
Technology |
Upload: | oracle-user-group-estonia |
© Swedbank
Mart Tudre – Swedbank Baltic DW architect
Rein Adamson – Project Manager
Oracle Data Integrator ETL software in Swedbank EDW 2007 – 2011
Agenda
• EDW - Enterprise Data Warehouse
– EDW, BI definitions
– Swedbank Baltic DW - general facts
• ETL software evaluation 2007
– ETL Software evaluation and Proof of Concept 2007
– ODI Implementation project
– User roles today
• ODI implementation in Swedbank Baltic DW
– ODI defining features
– Usage specifics and custom components
Data WareHouse – a definition
• A data warehouse is a repository of an organization's electronically stored data, designed to facilitate reporting and analysis.
• An expanded definition for data warehousing includes tools for
– business intelligence
– extracting, transforming and loading data into the repository
– managing and retrieving metadata.
Business intelligence - computer-based techniques used in spotting, digging-out, and analyzing business data
Source: wikipedia.org
ETL – Extract, Transform, Load
EDW – Enterprise Data Warehouse (also IT org.unit in Swedbank)
Business Intelligence functions
• predictive analytics (statistics, data mining)
• online analytical processing (OLAP)
• business performance management
• benchmarking
• text mining
• reporting
Data Warehouse architecture
[Diagram labels: Source; Operational Data; Data Transformation; Enterprise Data Warehouse, Integrated Data Marts; Replication; Business Users; Analytical Users]
Data flows
[Diagram labels: Source systems (CARDS, DEPOSIT, LEASING, LOAN, GL, ...); Data acquisition; Central data store; Data delivery; Analytical services (CM, FM, RM, CB)]
Subject areas shown in the data store:
• FINANCE: The internal accounting of the business.
• PARTY ASSET: Things parties have an interest in that have value.
• PRODUCT: Any marketable product or service including terms, conditions and features.
• INTERNAL ORGANIZATION: A Party that is a unit of business.
• LOCATION: A physical address, electronic address or geographical area.
• CAMPAIGN: A communication plan to deliver a message.
• CHANNEL: The vehicle by which a party may interact with the financial institution.
• EVENT: Something of interest that happened that may or may not involve contact with the customer.
• AGREEMENT: A contract or any type of agreement of interest between Parties.
• PARTY: An individual, business or group of individuals of interest to the financial institution.
Swedbank Baltic DW
Swedbank Baltic Data Warehouse (EDW) is a subject oriented,
integrated, time-variant, non-volatile collection of enterprise data.
– Subject Oriented: Information is organized by subject areas instead of business line specific source system data structure. Subject areas are Party, Product, Agreement, Channel, Organization, Event etc.
– Integrated: Data is gathered into the data warehouse from a variety of sources and merged into a coherent whole under unified governance by using agreed dimensions, such as Party, Product, Agreement, Channel, Organisation etc.
– Time-variant: All data in the data warehouse is identified with a particular time period. DW stores history.
– Non-volatile: Data in the data warehouse is usually not over-written or deleted. Once committed, the data is read-only, and retained for future reporting and analysis.
– Detailed: The granularity is detailed business events.
– Based on reference industry model: Teradata financial services logical datamodel.
Multiple usage of data warehouse
• Different business services have different requirements for
– data availability frequency and timing
(e.g. daily 6 am, daily 6 pm, monthly 1st day 8 am)
– data quality
(some services have near 0 tolerance to errors)
– performance and workload
Enterprise model (High level)
[Diagram of subject areas; not all relationships are shown]
• PARTY: An individual or group of individuals.
• LOCATION: A geographic or spatial area, physical address or electronic address.
• EVENT: Financial or non-financial event which may involve contact with the customer.
• INTERNAL ORGANIZATION: A unit of business within the financial institution or insurance company. Is a type of Party.
• AGREEMENT: A contract or deal between parties that is of interest.
• PRODUCT: Any marketable or tradable product or service including terms and conditions.
• CAMPAIGN: A communication plan directed at parties or a market for a purpose.
• CHANNEL: The vehicle by which a customer interacts with the financial institution/insurance company.
• FINANCE: The internal accounting of the business.
• ASSET: Items that belong to parties and which have value.
Swedbank Baltic DW Statistics
External
• 30 source systems (containing 1000 source objects)
• 50 business services
• 75 employees in Baltic DW
Internal
• 20 Terabytes of storage (planned for 2012: 50 TB)
• 650 objects in main data store
• 500 ETL processes
• 4000 database objects
• 40 database schemas in DW
• 245 direct db users, 500 reporting users
How to manage
• everyday operations
• development
• testing
• releasing
• migration (both technical and business)
• ETL workflow optimisation
Answer: an enterprise metadata system is needed
ETL is part of METADATA
Enterprise Metadata
ETL Software evaluation and POC 2007
Rein Adamson – project manager
• Request for Proposal to 4 Vendors
• 2 Vendors selected for Proof of Concept (POC)
– Oracle “ODI”
– Informatica “PowerCenter” (ETL market leader)
• POC budget 20 kEUR
• Evaluation process duration 5-8 months:
– 2 months: RFP and selection of 2 Vendors for POC
– 4 months: POC preparation
– 1 month: POC action + results to management decision
– 1 month: License and Implementation Contract with Winner
POC - Proof of Concept 2007
• POC budget 10 kEUR per Vendor included:
– 1 day system installation on bank IT infrastructure
– 2 days preparation before arrival (5 tasks sent in advance)
– 5 days onsite consultant
• POC scope in 5 days with consultant:
– Day 1: training for POC team (5 persons)
– Days 2-4: guidance to team for development of 5 ETL tasks
– Last day: 2-hour demo to IT managers
POC Loading tasks scenarios
TASKS CONTENT:
• Task 1 – Agreement loading (incl. historisation)
• Task 2 – Trigger filled to history table (incl. country context)
• Task 3 – Rows to columns and vice versa
• Task 4 – Aggregation within Teradata
• Task 5 – Bank transactions (events) loading
– from 3 sources into 1 target, capacity performance test with 7 million rows
• 3 days to complete 5 ETL tasks
• 1 task for each POC team member. Experienced DWH specialists: developer, analyst, DBA, Admin, 2 architects
• Consultant was a trainer to support our specialists
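To make the "rows to columns and vice versa" task type concrete, here is a minimal Python sketch of the transformation itself. It is not the POC solution (the POC tasks were built in the evaluated ETL tools); the function names and the sample balances are hypothetical.

```python
from collections import defaultdict


def rows_to_columns(rows):
    """Pivot (key, attribute, value) rows into one dict of attributes per key."""
    wide = defaultdict(dict)
    for key, attribute, value in rows:
        wide[key][attribute] = value
    return dict(wide)


def columns_to_rows(wide):
    """Unpivot the wide form back into sorted (key, attribute, value) rows."""
    return sorted(
        (key, attribute, value)
        for key, attributes in wide.items()
        for attribute, value in attributes.items()
    )


# Hypothetical sample data, not taken from the POC.
balances = [
    ("ACC-1", "EUR", 100),
    ("ACC-1", "USD", 25),
    ("ACC-2", "EUR", 40),
]
wide = rows_to_columns(balances)
narrow = columns_to_rows(wide)
```

The round trip returns the original rows, which is a simple way to check such a loading in both directions.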
KSF - Key Success Factors evaluated
• Reusability and standardization of loadings (high)
• Impact analysis on attribute level
• Resources for EDW services performance
• Release deployment and configuration
• Functionality of metadata repository (medium priority)
• Improve EDW development process
• EDW loading and calculation workflow management
• Faster analysis stage of development task
• Faster process and error maintenance (low priority)
Reusability and standardization of loading patterns
Flexibility of loading templates. Customizable, but robust. Target is to shorten time of development by reusing existing patterns.
ODI
• All the objects in ODI are reusable because of the substitution method used.
• ELT architecture supports today's skill sets
• Business and technical information has been separated from data load logic.
• 2,8 points out of 3
INFORMATICA
• Templates are fixed source/target templates. Technical options are integrated with business logic.
• It is possible to create reusable components, but while doing the tasks it became clear that at some point it is easier to start from a blank page.
• 1,8 points out of 3
Release deployment and configuration
Time and understanding needed for maintenance and deployment of new loading procedures. Easier and faster release management.
ODI
• Topology is transparent and easily understandable,
• Monitoring is at necessary detail level together with debugging,
• No additional environments needed, information is moving between repositories only,
• Versioning with install/rollback functionality is available.
• ...
• 2,6 points out of 3
IFA
• Topology is not clear and transparent
• Release complexity can grow to a level comparable to today's situation,
• Monitoring and debugging is available at high level until steps have been completed, no intermediate access,
• Country based approach is not supported in central repository.
• 1,8 points out of 3
POC results summary comment
– ODI utilizes the existing infrastructure. There is no (new) proprietary transformation server/database. The tool uses the source and target database engines and their tools to unload/load and transform the data. It is transparent. No need for entirely new skills or additional specialists.
– Informatica brings in totally new technology: additional specialists needed, more training and consultancy to buy.
KSF evaluation points (max 3)
[Bar chart: ODI and IFA scores on a scale from 1,0 to 3,0 for each key success factor]
1. Reusability and standardisation of loadings
2. Impact analysis on attribute level
3. Resources for EDW services performance
4. Release deployment and configuration
5. Functionality of metadata repository
6. Improve EDW development process
7. EDW loading and calculation workflow management
8. Faster analysis stage of development task
9. Faster process and data error maintenance
ODI implementation Sept 2007 - Sept 2008
• Oracle ODI partner consultancy used
– 1 standard training of 4 days, 10 persons in class
– 1 onsite visit of 2 days (consultant from Italy)
– 5 days off-site consultancy during 3 months (Poland)
– 5 Oracle support cases
• Customer resource
– 1 experienced ETL developer assigned 100% for 1 year
• Custom solutions design and implementation:
– ETL Process registry design and development (2 months duration)
– Common Wrapper development (3 months)
– Process Registry and Common Wrapper testing, debugging (2 months)
– ODI release process procedures implementation (2 months)
83 active ODI Users today
• 59 users in EDW (71%), 22 users in CRM area (27%)
• 35 Analyst-Developers; 16 SQA-s. Dev+SQA=61%
[Bar chart: number of ODI users (axis 0 to 40) by role (Developer, SQA, Service Manager, Implementator, other manager, App.admin, Sys.admin-DBA), split by area (CRM, EDW, LOANS)]
Oracle Data Integrator
Oracle Data Integrator
• Oracle Data Integrator is a comprehensive data integration platform that covers all data integration requirements from high-volume, high-performance batch loads, to event-driven, trickle-feed integration processes, to SOA-enabled data services.
ODI is Oracle’s Strategic Product for Data Integration
• Heterogeneous E-LT Architecture
• Optimized Connectivity Architecture
• Modular Implementation Architecture
• SOA-Native Architecture
ODI Component Architecture
Repository Set-Up Pattern
• Master Repository: Security, Topology, Versioning
• Work Repository (Development): Models, Projects, Execution
• Work Repository (Test & QA): Models, Projects, Execution
• Execution Repository (Production): Execution
Development – Test – Production Cycle
• Create and archive versions of models, projects and scenarios
• Import released versions of models, projects and scenarios for testing
• Import released and tested versions of scenarios for production
E+LT approach
[Diagram: example data models with their column lists. Entities shown: CL_PARTY, CL_CONTRACT, CL_BANK_ACCOUNT; CUSTOMER, ORDER, ITEM, ORDER ITEM SHIPPED, ORDER ITEM BACKORDERED; HOST_PARTY, HOST_PARTY_IDENTIFICATION_HISTORY, MASTER_PARTY, HOST_PARTY_ACCOUNT, HOST_PARTY_RELATION]
ODI Topology usage example
PHYSICAL SERVERS - PRODUCTION
• ODI Server Name: PROD_CORE_EE, Server Name: TALLINN (LDAP), Schema: CARD
• ODI Server Name: PROD_CORE_LV, Server Name: RIGA (LDAP), Schema: CARD
• ODI Server Name: PROD_DW_GR, Server Name: EDW.DOMAIN.EE (IP), Schema: MAIN
LOGICAL SCHEMAS
• Logical Schema: CORE_CARD, contexts: PROD_EE, PROD_LV
• Logical Schema: DW_MAIN, contexts: PROD_EE, PROD_LV, PROD_GR
• Logical schema is mapped through Context to Physical Server and Physical Schema
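The mapping rule can be sketched as a lookup table keyed by context and logical schema. This is an illustrative Python model, not ODI's repository structure or API; the names come from the slide, but the exact pairing of contexts with physical servers is an assumption read off the flattened diagram.

```python
# (context, logical schema) -> (ODI physical server, physical schema)
# Assumption: the pairing below is inferred from the diagram labels.
TOPOLOGY = {
    ("PROD_EE", "CORE_CARD"): ("PROD_CORE_EE", "CARD"),
    ("PROD_LV", "CORE_CARD"): ("PROD_CORE_LV", "CARD"),
    ("PROD_EE", "DW_MAIN"): ("PROD_DW_GR", "MAIN"),
    ("PROD_LV", "DW_MAIN"): ("PROD_DW_GR", "MAIN"),
    ("PROD_GR", "DW_MAIN"): ("PROD_DW_GR", "MAIN"),
}


def resolve(context, logical_schema):
    """Return the physical server and schema for a logical schema in a context."""
    try:
        return TOPOLOGY[(context, logical_schema)]
    except KeyError:
        raise LookupError(
            f"{logical_schema} is not mapped in context {context}"
        ) from None


card_lv = resolve("PROD_LV", "CORE_CARD")
```

Because a dictionary key holds a single value, the model also reflects the constraint that one logical schema resolves to exactly one physical schema within one context.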
Features of ODI topology
• Physical server has fixed user name and password
• One logical schema can map to exactly one physical schema in one context
– To use multiple users in the same database: define more contexts or duplicate the datamodel
• Logical schema cannot change technology
Conclusion – a database schema needs to be defined as many times as it has database users
A single shared database connection is preferred to maximize ELT –> the trade-off is weaker resource management by user name on the database side
ODI developer basic steps
1. Reverse engineer data models from source and target
2. Define column level data mappings, specify join and filter conditions. Every data mapping (ODI interface) can have exactly one target and multiple sources
3. Select knowledge module (code generator)
4. Generate code (ODI scenario) and execute scenario
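The idea behind steps 2 to 4 (declarative mappings plus a code-generating template) can be sketched in a few lines of Python. This is only an analogy: real ODI knowledge modules use ODI's own substitution mechanism, and the template, function and table names below are hypothetical.

```python
from string import Template

# Hypothetical stand-in for a knowledge module: a SQL template with placeholders.
APPEND_KM = Template(
    "INSERT INTO $target ($columns)\n"
    "SELECT $expressions\n"
    "FROM $sources\n"
    "WHERE $conditions"
)


def generate_code(km, target, mappings, sources, joins_and_filters):
    """Fill the template from one target, its column mappings and its sources."""
    return km.substitute(
        target=target,
        columns=", ".join(mappings),
        expressions=", ".join(mappings.values()),
        sources=", ".join(sources),
        conditions=" AND ".join(joins_and_filters) or "1 = 1",
    )


# One target, column level mappings, one source and one filter (all illustrative).
sql = generate_code(
    APPEND_KM,
    target="DW_MAIN.PARTY",
    mappings={
        "Party_Id": "c.CUSTOMER_NUMBER",
        "Full_Name": "UPPER(c.CUSTOMER_NAME)",
    },
    sources=["CORE_CARD.CUSTOMER c"],
    joins_and_filters=["c.CUSTOMER_NAME IS NOT NULL"],
)
```

Swapping the template changes the loading pattern without touching the mappings, which is the separation of business information from load logic that the evaluation valued.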
ODI scenario generation and execution
[Diagram labels: ODI Designer (Code Generation): Package, Knowledge modules, Interfaces, Data Objects, Scenario; ODI Agent (Code Execution): Context (Topology), Runtime variables; the Agent connects to DB 1 and DB 2 and executes commands]
• When knowledge module changes – rebuild and deploy all related scenarios
• When database objects change – refresh data structure definitions from source database, rebuild and deploy all related scenarios
Custom components to manage 500 ETL processes
• Process registry
– all processes and their dependencies
• Common wrapper
– special scenario wrapping all others
• ODI monitor
– Web access to process registry
• Release builder
– Used for deploying from test to development
Process registry
• List of all ETL processes regardless of technology
– Create, change, retire process
– All necessary information for maintaining the list
• Process scheduling information
• Dependencies between processes
– Process to process dependencies
– Dependencies through “Dependency Group”
– Based on process bookmarks
Common Wrapper
• Special single-instance ODI scenario, through which all other scenarios are executed (pre and post steps)
• Implements common functionality needed for all processes
– Checks if preliminaries of the process have been fulfilled
– Checks if the process is allowed to run at the moment
– Assigns common process control variables and passes their values to the executed scenario
– Logs execution bookmarks, ODI session ids, run result
– Alerts monitoring in case of failure
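The wrapper's pre and post steps can be sketched as a single function that every process runs through. This is an illustrative Python outline, not the Swedbank implementation (which is an ODI scenario); all function and parameter names are hypothetical, and the callbacks stand in for the process registry, the log and the monitoring system.

```python
def run_wrapped(process, scenario, preliminaries_met, run_allowed, log, alert):
    """Run `scenario` through the common pre and post steps; return the run result."""
    # Pre steps: both checks must pass before the scenario is started.
    if not preliminaries_met(process):
        log(process, "SKIPPED", "preliminaries not fulfilled")
        return "SKIPPED"
    if not run_allowed(process):
        log(process, "SKIPPED", "run not allowed at the moment")
        return "SKIPPED"
    control = {"process_name": process}  # common process control variables
    try:
        bookmark = scenario(control)
    except Exception as exc:
        # Post steps on failure: log the result and alert monitoring.
        log(process, "FAILED", str(exc))
        alert(process, exc)
        return "FAILED"
    # Post step on success: log the execution bookmark.
    log(process, "OK", bookmark)
    return "OK"


entries, alerts = [], []


def log(process, status, detail):
    entries.append((process, status, detail))


def alert(process, exc):
    alerts.append(process)


def failing_scenario(control):
    raise RuntimeError("source unavailable")


always = lambda process: True
result_ok = run_wrapped(
    "LOAN_AGRMT_LOADING", lambda control: "2011-05-31", always, always, log, alert
)
result_failed = run_wrapped(
    "BANK_ACCOUNT_LOADING", failing_scenario, always, always, log, alert
)
```

Keeping these steps in one place means a change to logging or alerting does not require touching the individual loading scenarios.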
Custom components overview
Dependency group
• A dependency group is defined by the data content that a process delivers. It corresponds to a business concept / subject area + data availability.
• Processes are either:
– Suppliers of a Dependency group
– Consumers of a Dependency group
• Dependency groups are also used to show data availability bookmarks to users in the ad-hoc reporting environment
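A minimal Python sketch of how supplier and consumer links through dependency groups can be turned into process-to-process dependencies and a valid run order. The process and group names appear on the slides, but which process supplies or consumes which group is an assumption made for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# process -> dependency groups it supplies (pairing is illustrative)
SUPPLIES = {
    "Loan Agrmt loading": {"Credit Agrmt Dly"},
    "Leasing Agrmt Loading": {"Credit Agrmt Dly"},
    "Value added calculation 1": {"Fin Agrmt Dly"},
}
# process -> dependency groups it consumes (pairing is illustrative)
CONSUMES = {
    "Value added calculation 1": {"Credit Agrmt Dly"},
    "Value added calculation 2": {"Fin Agrmt Dly"},
}


def process_dependencies(supplies, consumes):
    """Map each process to the set of processes that must finish before it."""
    suppliers = {}
    for process, groups in supplies.items():
        for group in groups:
            suppliers.setdefault(group, set()).add(process)
    deps = {process: set() for process in set(supplies) | set(consumes)}
    for process, groups in consumes.items():
        for group in groups:
            deps[process] |= suppliers.get(group, set())
    return deps


deps = process_dependencies(SUPPLIES, CONSUMES)
order = list(TopologicalSorter(deps).static_order())
```

A consumer never needs to know which concrete processes fill a group, so suppliers can be added or retired without editing the consumers.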
Dependencies between processes
[Diagram: supplying processes are linked to dependency groups by "Is Supplier for"; consuming processes are linked to dependency groups by "Is Consumer of"]
• Supplying processes: Loan Agrmt loading, Bank Account loading, Factoring Agrmt Loading, Leasing Agrmt Loading
• Dependency groups: Credit Agrmt Dly, Fin Agrmt Dly, Fin Agrmt Bal Dly
• Consuming processes: Value added calculation 1, Value added calculation 2
Enterprise metadata context
[Diagram labels: Enterprise Metadata Repository; TECHNICAL OPERATIONAL METADATA; RDBMS metadata; CASE tool (Logical data models); Transformation metadata (ETL tools); Presentation metadata (Reporting tools); Manual Metadata (Services, business requirements etc); METADATA USER INTERFACE; Metadata reports]
Metadata model – ETL related
Sources of metadata: MANUAL CONFIGURATION, TRANSFORMATION, PROCESS REGISTRY, RDBMS
Entities and their attributes:
• PROCESS: Process_Name, Service_Component_ShortName (FK), ETL_Server_Name, Process_Status_ShortName, Process_Executable_Name (FK)
• PROCESS_EXECUTABLE: Process_Executable_Name, Package_Name (FK)
• PACKAGE: Package_Name
• PACKAGE_SOURCE_OBJECT: Package_Name (FK), DB_Object_Name (FK), Impact_Layer_Name (FK)
• PACKAGE_TARGET_OBJECT: Package_Name (FK), DB_Object_Name (FK), Impact_Layer_Name (FK)
• PROCESS_PARAM: Process_Name (FK), Param_Name, Param_Value
• PROCESS_DEPENDENCY: Process_Name (FK), Dependency_Group_Name (FK)
• DEPENDENCY_GROUP: Dependency_Group_Name
• PACKAGE_SOURCE_LAYER: Package_Name (FK), Impact_Layer_Name (FK)
• PACKAGE_TARGET_LAYER: Package_Name (FK), Impact_Layer_Name (FK)
• DB_OBJECT: Impact_Layer_Name (FK), DB_Object_Name, Service_Component_ShortName (FK)
• IMPACT_LAYER: Impact_Layer_Name, Service_Component_ShortName (FK)
• PROCESS_RUN: Process_Name (FK), Process_Execution_Dtime, Process_Boomark_Values
• PROCESS_SCHEDULE_TIME: Process_Schedule_Type_Code, Process_Schedule_No, Process_Name (FK), Frequency_Type, Frequency_Value
• SERVICE: Service_Component_ShortName
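One use of the PACKAGE_SOURCE_OBJECT and PACKAGE_TARGET_OBJECT links is impact analysis: finding everything downstream of a changed database object. The sketch below is a hedged Python illustration working at object level (the evaluation criterion mentions attribute level, which this does not cover); the sample rows and the function name are hypothetical.

```python
# Hypothetical rows: (Package_Name, DB_Object_Name)
PACKAGE_SOURCE_OBJECT = [
    ("PKG_LOAD_PARTY", "STG.CUSTOMER"),
    ("PKG_LOAD_AGREEMENT", "DW.PARTY"),
    ("PKG_REPORT", "DW.AGREEMENT"),
]
PACKAGE_TARGET_OBJECT = [
    ("PKG_LOAD_PARTY", "DW.PARTY"),
    ("PKG_LOAD_AGREEMENT", "DW.AGREEMENT"),
    ("PKG_REPORT", "MART.AGREEMENT_SUMMARY"),
]


def impacted(changed_object, sources, targets):
    """Return (packages, objects) that are downstream of a changed DB object."""
    packages, objects = set(), set()
    frontier = [changed_object]
    while frontier:
        current = frontier.pop()
        for package, source in sources:
            if source == current and package not in packages:
                packages.add(package)
                # Everything this package writes is affected in turn.
                for target_package, target in targets:
                    if target_package == package and target not in objects:
                        objects.add(target)
                        frontier.append(target)
    return packages, objects


packages, objects = impacted("DW.PARTY", PACKAGE_SOURCE_OBJECT, PACKAGE_TARGET_OBJECT)
```

Joining the resulting packages to PROCESS_EXECUTABLE and PROCESS would then give the affected ETL processes.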
Process execution preliminaries
Process execution Daemon
• Planned component for automatic ETL workflow management. It starts a process when:
– It is time to process new data
– Preliminaries are ready
– Process run is allowed
• Replacement of enterprise job scheduler
• Utilizing framework of Process Registry and Common Wrapper
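The three start conditions can be expressed as one selection pass over the process registry. Since the daemon is described as a planned component, this Python sketch is purely illustrative; the registry layout, field names and sample entries are assumptions.

```python
from datetime import datetime


def startable(registry, available_groups, now):
    """Names of registered processes that satisfy all three start conditions."""
    return sorted(
        name
        for name, entry in registry.items()
        if now >= entry["scheduled_at"]              # it is time to process new data
        and entry["consumes"] <= available_groups    # preliminaries are ready
        and entry["run_allowed"]                     # process run is allowed
    )


# Hypothetical registry entries.
registry = {
    "Value added calculation 1": {
        "scheduled_at": datetime(2011, 6, 1, 6, 0),
        "consumes": {"Credit Agrmt Dly"},
        "run_allowed": True,
    },
    "Value added calculation 2": {
        "scheduled_at": datetime(2011, 6, 1, 6, 0),
        "consumes": {"Fin Agrmt Dly"},
        "run_allowed": True,
    },
    "Monthly report": {
        "scheduled_at": datetime(2011, 7, 1, 8, 0),
        "consumes": set(),
        "run_allowed": True,
    },
}
ready = startable(registry, {"Credit Agrmt Dly"}, datetime(2011, 6, 1, 6, 30))
```

A daemon would repeat such a pass at intervals and hand each selected process to the common wrapper.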
Our experience with ODI (10g)
• Performance concerns
– Educate developers to use existing patterns
– Optimize knowledge modules, while keeping them as generic as possible
– Made lightweight quick web application for accessing execution logs
• Functionality
– Modified almost every KM which is now in use
– Created new KMs for common needs (new history integration, SAX XML parsing for loadings, streamed XML output etc.)
– Made workarounds for missing features: OLAP function support, sub queries
– Utilized ODI code substitution framework to maximum
– Made command line utility to start ODI session on remote Agent
– Use DTS Agent for scheduling – single high-level workflow management system
Our experience with ODI (10g) , continued
• Deployment
– We use single ODI project per Area – shared sets of KMs and Variables
– To test – install separately changed data models, knowledge modules and ODI folders (common releasable unit, based on custom export script)
– Huge ODI project import operation required custom solution to do incremental restore for whole project.
• ETL Administrator concerns
– no way to change the code directly in production (in case of urgent issues)
Migration of ETL processes from MS-DTS to ODI started in 2008. Current status:
[Bar chart: Number of ETL processes (axis 0 to 300); values 261 and 257; series ODI, DTS]
[Chart: Number of tasks in ETL processes; values 3819 and 3224; series DTS, ODI]
Questions?