Oracle Data Integrator
BGOUG Spring Conference, 2009
Oracle Data Integrator: Overview and Demo
Maria Mundrova
ETL Developer/DW Consultant
Presentation Agenda
• Oracle Data Integrator
  • What is ODI?
  • ELT Approach
  • Architecture
• Oracle Data Integration
  • The Three Rights
  • Scenarios
• Build ETL Process
  • Interfaces, Sources/Targets, Mappings, Filters, Transformations
• Oracle Data Profiling
• Oracle Data Quality
• ETL Platforms Overview
• DW Best Practices for Oracle
• ODI Case Studies
• Q & A
What is ODI?
• Data Integration Tool
• ELT for Data Warehouse
• Data movement, migration and consolidation
• Transformation from multiple sources to heterogeneous targets
• Data Profiling and Data Quality
• Provides Data Services to Oracle SOA Suite
ELT Approach
• Data transformations on source or target database
• Leverages the power of the database
• No need for an ETL server
• Describe what the process does, not how
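The ELT idea can be sketched in a few lines. This is a minimal illustration, not ODI-generated code: an in-memory SQLite database stands in for the target database, and the table names (stg_customers, dim_customers) and sample rows are assumptions made up for the example. Rows are loaded untransformed into a staging table, then a single set-based SQL statement does the transformation inside the database.

```python
import sqlite3

# In-memory SQLite stands in for the target database (an assumption for
# illustration; a real ELT tool would generate SQL for Oracle, DB2, etc.).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Extract + Load: copy the source rows, untransformed, into a staging table
# that lives inside the target database.
cur.execute("CREATE TABLE stg_customers (cust_id INTEGER, name TEXT, gender TEXT)")
source_rows = [(1, " alice ", "f"), (2, "BOB", "m"), (3, "carol", None)]
cur.executemany("INSERT INTO stg_customers VALUES (?, ?, ?)", source_rows)

# Transform: one set-based SQL statement executed by the database engine.
# No row passes through a separate ETL server.
cur.execute(
    "CREATE TABLE dim_customers (cust_id INTEGER PRIMARY KEY, name TEXT, gender TEXT)"
)
cur.execute(
    """
    INSERT INTO dim_customers (cust_id, name, gender)
    SELECT cust_id,
           UPPER(TRIM(name)),
           COALESCE(UPPER(gender), 'U')
    FROM stg_customers
    """
)
conn.commit()

result = cur.execute(
    "SELECT cust_id, name, gender FROM dim_customers ORDER BY cust_id"
).fetchall()
print(result)
```

The INSERT ... SELECT states what the target should contain; how the rows are scanned and written is left to the database engine.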
Architecture
• Graphical modules, Runtime components, Repository
• Security Manager: manage user privileges
• Operator: operate production, monitor sessions
• Topology Manager: define the IS infrastructure
• Designer: reverse-engineer, develop projects, release scenarios
• Any Web Browser: browse metadata lineage, operate production
• Installable on any platform that supports Java 1.5, including Windows, Solaris, Linux, HP-UX, pSeries; file system space: approx. 200 MB
• Scheduler Agent: handles schedules, orchestrates sessions
• Metadata Navigator: web access to the repository
• Repository: Oracle, DB2, any ISO-92 RDBMS; approx. 500 MB Master (security, topology) + 500 MB Work (models, projects, execution)
Oracle Data Integration
• Data Integration is referred to as ETL
• Get the right data in the right place at the right time
• Scenarios: ELT for Data Warehouse, MDM
Build ETL Process
We need to:
• Create and reverse-engineer models
• Create Project, Folders, Packages
• Procedures
• Import Knowledge Modules
  • R(everse), L(oad), I(ntegrate), J(ournalize), C(heck), S(ervice)
• Create Interface:
  • Define Targets
  • Define Sources, Filters, Transformations
  • Define mappings between source and target
  • Define flow control
• Execute
Expressions, Joins, Sources, Targets, Mapping
Oracle Data Profiling
• Investigate data
• Examine dependencies
• Create joins
• Check data compliance (patterns)
• Define business rules
• Assess data through metrics
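As a rough idea of what checking pattern compliance and assessing data through metrics mean in practice, here is a small generic sketch. It does not use the Oracle Data Profiling product; the profile_column helper, the sample phone numbers and the pattern are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical sample column; the values are made up for illustration.
phone_numbers = ["02-555-0101", "02-555-0102", None, "5550103", "02-555-0101"]


def profile_column(values, pattern):
    """Return simple profiling metrics for one column of values."""
    non_null = [v for v in values if v is not None]
    regex = re.compile(pattern)
    matching = [v for v in non_null if regex.fullmatch(v)]
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "pattern_matches": len(matching),
        "most_common": Counter(non_null).most_common(1)[0] if non_null else None,
    }


metrics = profile_column(phone_numbers, r"\d{2}-\d{3}-\d{4}")
print(metrics)
```

Metrics like these (null count, distinct count, share of values matching a pattern) are the raw material for the business rules that a data quality step then enforces.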
Oracle Data Quality
• Data cleansing
• Data enrichment
• Data standardization
• Repair and correct fields, values, records
[Diagram labels: Data Validation; Understanding the Data; Managing Data Quality Issues (Tactical); Addressing Data Quality Issues (Strategic); Enriching the Data]
ETL Platforms Overview
• Oracle Data Integration Enterprise Edition
• Informatica
• Ab Initio
• IBM DataStage
• Microsoft DTS
DW Best Practices for Oracle – Bitmap index
• Bitmap index: usually recommended for low-cardinality columns; well suited to DSS (read-mostly) workloads regardless of cardinality
select t.cust_id, t.cust_gender, t.cust_marital_status, t.cust_income_level
from customers t;

select count(*)
from customers
where cust_marital_status = 'married'
  and cust_gender = 'M'
  and cust_income_level in ('F: 110,000-129,000', 'I: 170,000-189,000');
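Why a count query with several AND/IN predicates suits bitmap indexes can be shown with a toy model. This is a conceptual sketch only, not how Oracle stores bitmaps internally; the sample rows and the build_bitmap_index helper are assumptions for illustration. Each distinct column value gets one bitmap (a Python int), with bit i set when row i holds that value.

```python
# Hypothetical rows of (cust_gender, cust_marital_status, cust_income_level).
rows = [
    ("M", "married", "F: 110,000-129,000"),
    ("F", "married", "F: 110,000-129,000"),
    ("M", "single", "I: 170,000-189,000"),
    ("M", "married", "I: 170,000-189,000"),
    ("M", "married", "A: Below 30,000"),
]


def build_bitmap_index(rows, column):
    """Map each distinct value of a column to an int used as a bitmap:
    bit i is set when row i holds that value."""
    index = {}
    for i, row in enumerate(rows):
        index[row[column]] = index.get(row[column], 0) | (1 << i)
    return index


gender = build_bitmap_index(rows, 0)
marital = build_bitmap_index(rows, 1)
income = build_bitmap_index(rows, 2)

# WHERE marital = 'married' AND gender = 'M' AND income IN (two bands):
# IN becomes a bitwise OR, AND becomes a bitwise AND.
income_in = income.get("F: 110,000-129,000", 0) | income.get("I: 170,000-189,000", 0)
hits = marital.get("married", 0) & gender.get("M", 0) & income_in

match_count = bin(hits).count("1")  # COUNT(*) is a count of the set bits
print(match_count)
```

The whole predicate is evaluated with bitwise operations on the bitmaps, and the count is obtained without reading any table row.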
DW Best Practices for Oracle – Bitmap join index
• Bitmap join index on fact table sales for cust_gender
CREATE BITMAP INDEX sales_cust_gender_bjindx
  ON sales(customers.cust_gender)
  FROM sales, customers
  WHERE sales.cust_id = customers.cust_id
  LOCAL NOLOGGING COMPUTE STATISTICS;

SELECT sales.time_id, customers.cust_gender, sales.amount_sold
  FROM sales, customers
  WHERE sales.cust_id = customers.cust_id;
Join result used to create the bitmaps stored in the bitmap join index
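The bitmap join index idea can be modelled in the same toy fashion: the join between fact and dimension is evaluated once, when the index is built, and the resulting bitmaps point at fact rows. This is a conceptual sketch, not Oracle's implementation; the sample data and the sales_total_for_gender helper are hypothetical.

```python
# Hypothetical dimension and fact data.
customers = {101: "M", 102: "F", 103: "M"}  # cust_id -> cust_gender
sales = [(101, 25.0), (102, 40.0), (101, 10.0), (103, 5.0)]  # (cust_id, amount_sold)

# Build time: resolve the join once and record, per cust_gender value,
# which rows of the fact table belong to it (bit i = fact row i).
bitmap_join_index = {}
for row_no, (cust_id, _amount) in enumerate(sales):
    cust_gender = customers[cust_id]
    bitmap_join_index[cust_gender] = bitmap_join_index.get(cust_gender, 0) | (1 << row_no)


def sales_total_for_gender(cust_gender):
    """Sum amount_sold for one gender using only the index and the fact rows;
    the customers table is not consulted at query time."""
    bits = bitmap_join_index.get(cust_gender, 0)
    return sum(
        amount
        for row_no, (_cust_id, amount) in enumerate(sales)
        if (bits >> row_no) & 1
    )


print(sales_total_for_gender("M"))
```

A query filtering sales by customers.cust_gender can then avoid the join at run time, which is the benefit the slide is pointing at.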
DW Best Practices for Oracle – Exchange partition
• Load technique for large tables: no physical move, just reset pointers
• Steps
  1. Create partitioned table TEST (destination)
  2. Create temp table TEST_TEMP
  3. Load records into TEST_TEMP
  4. Add local PK to the partitioned table TEST
  5. Add PK to the offline TEST_TEMP table
  6. Gather optimizer statistics on TEST_TEMP
  7. Swap offline table into the partition
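"No physical move, just reset pointers" can be pictured with a toy data dictionary. This is only a mental model, not Oracle internals; the Segment class, the dictionary layout and the exchange_partition helper are hypothetical names for illustration.

```python
class Segment:
    """A stand-in for the physical storage (rows) of a table or partition."""

    def __init__(self, rows):
        self.rows = rows


# Toy data dictionary: logical object name -> storage segment.
dictionary = {
    "TEST.TEST_PARTITION": Segment([]),        # empty destination partition
    "TEST_TEMP": Segment(list(range(10000))),  # staging table, already loaded
}


def exchange_partition(dictionary, partition, table):
    """Swap the two dictionary entries; no row is copied or moved."""
    dictionary[partition], dictionary[table] = dictionary[table], dictionary[partition]


loaded_segment = dictionary["TEST_TEMP"]
exchange_partition(dictionary, "TEST.TEST_PARTITION", "TEST_TEMP")

# The partition now refers to the very same segment object that was loaded.
print(dictionary["TEST.TEST_PARTITION"] is loaded_segment)
```

Because only the references change, the cost of the swap in this model does not depend on how many rows were loaded, which is why the technique is attractive for large tables.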
DW Best Practices for Oracle – Exchange partition
-- Partitioned destination table
CREATE TABLE test (
  id          NUMBER(12,6),
  description VARCHAR2(10),
  data        VARCHAR2(100)
)
PARTITION BY RANGE (id) (  -- Partition Key = Primary Key
  PARTITION test_partition VALUES LESS THAN (MAXVALUE)
);

-- Offline staging table with the same column layout
CREATE TABLE test_temp (
  id          NUMBER(12,6),
  description VARCHAR2(10),
  data        VARCHAR2(100)
);

-- Load 10,000 generated rows with a direct-path insert
INSERT /*+ append ordered full(s1) use_nl(s2) */ INTO test_temp
SELECT TRUNC((ROWNUM-1)/500, 6), TO_CHAR(ROWNUM), RPAD('X', 100, 'X')
FROM all_tables s1, all_tables s2
WHERE ROWNUM <= 10000;

-- Local primary key on the partitioned table
ALTER TABLE test ADD CONSTRAINT pk_test PRIMARY KEY (id)
  USING INDEX (CREATE INDEX pk_test ON test (id) NOLOGGING LOCAL);

-- Matching primary key on the staging table
ALTER TABLE test_temp ADD CONSTRAINT pk_test_temp PRIMARY KEY (id)
  USING INDEX (CREATE INDEX pk_test_temp ON test_temp (id) NOLOGGING);

-- Swap the staging table into the partition
ALTER TABLE test EXCHANGE PARTITION test_partition WITH TABLE test_temp
  INCLUDING INDEXES WITHOUT VALIDATION;
ODI Case Studies
Raiffeisen International
✓ Replace existing ETL tool, which did not provide the scalability to populate data warehouses in 12 countries
✓ Improve ETL performance
✓ Reduce hand-coded ETL jobs
✓ Improved DWH processing performance by 40%

InterBank, Netherlands
✓ Increased development productivity
✓ Reduced the loading process from 8 hours to 2 hours
✓ Increased market share because of correct client profiling based on the risk

Nestlé Nespresso
✓ Supply a data warehouse from consolidated operational database of all countries
✓ Use horse-power of source and/or target RDBMS for transformation
✓ Replaced weekly insertion with daily recording of change for all clients
✓ Improved targeted campaigns and activities
Practical demo
• Work with
  • Designer
  • Operator
  • Repository
  • Security Manager
• Transform and load SH schema
• Investigate with Data Profiling
Conclusion
• Automates ETL process
• Provides Data Quality
• New tool for Oracle
• Recommend it
• ODI Needus trilogy ☺
Thank you!
Q & A