Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Oracle Data Integration: CON7922
Tame Big Data with Oracle Data Integration
Alex Kotopoulis, Senior Principal Product Manager, Oracle Fusion Middleware, Data Integration Solutions
Michael Rainey, Principal Consultant, Rittman Mead
Oracle OpenWorld 2014
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
Agenda
1. Oracle Data Integration Overview
2. Customer Cases and Best Practices
3. Big Data Demo
4. Q&A and For More Information
• OOW Data Integration Sessions and Additional Resources
Oracle Data Integration Solutions and Proven Benefits
Improve Agility
• Deploy Projects Faster
• Reliable Real-Time
Reduce Risk
• Popular, Proven Tools
• Open, Not Proprietary
Reduce Costs
• Better Productivity
• Eliminate ETL Servers
Analytic Data Integration
• Big Data Integration & Governance
• Data Warehouse Integration
• Business Intelligence Applications
Enterprise Data Integration and Governance
• Enterprise Data Quality and Profiling
• Comprehensive, Heterogeneous Data Integration
• Business Glossary and Metadata Management
Business Continuity
• Active-Active for Maximum Availability
• Zero Downtime Migrations
• Data Consolidation / Application Modernization
24 x 7 x 365
Comprehensive Data Integration & Governance Capabilities
Real-Time Data Movement
– Low impact capture, stage in Hadoop
– Continuous data availability
Data Transformation
– Bulk data movement
– Pushdown data processing
Data Federation
– Virtualized Data Services
Data Quality & Verification
– Fix quality at the source
– Verify data consistency
Metadata Management
– Lineage and Impact Analysis
– Business Glossary Semantics
Data Governance Foundation
Oracle Data Integrator (Transformation)
Enterprise Data Quality (Profile, Cleanse, Match and De-duplicate)
FastLoad
Oracle GoldenGate (Movement)
Enterprise Metadata Management & Business Glossary (Business Glossary, Data Lineage, Impact Analysis and Data Provenance)
Data Service Integrator (Federation)
GoldenGate Veridata (Online Data Verification)
ELT Processing on Hadoop or SQL
Continuous Availability
Differentiated Technical Approach
Dynamic Data Movement
– Real-time CDC by default, not ETL
– Least invasive on sources
– Proven best performance
– Integrated Oracle capture/apply
No ETL Engines
– Take the processing to the data; don’t move the data to the process
– Leverage your data engines for the workloads (Hadoop or SQL)
Most Heterogeneous
– Leverage open source Hadoop, not proprietary distributions
– Hadoop is the Hub, not ETL tools
– Open metadata standards
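The "take the processing to the data" point can be illustrated with a minimal sketch. It assumes an in-memory SQLite database standing in for the data engine (Hadoop or SQL on the slide); the `sales` table, its columns, and the function names are hypothetical and not part of any Oracle product.

```python
import sqlite3


def make_source(rows):
    """Create a stand-in data engine holding a hypothetical sales table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (country TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def totals_via_etl_engine(conn):
    """Move every row out of the engine, then aggregate on the client side."""
    totals = {}
    for country, amount in conn.execute("SELECT country, amount FROM sales"):
        totals[country] = totals.get(country, 0.0) + amount
    return totals


def totals_via_pushdown(conn):
    """Send the aggregation to the engine; only summary rows come back."""
    query = "SELECT country, SUM(amount) FROM sales GROUP BY country"
    return dict(conn.execute(query).fetchall())


conn = make_source([("US", 10.0), ("US", 5.0), ("DE", 7.5)])
print(totals_via_etl_engine(conn))
print(totals_via_pushdown(conn))
```

Both functions return the same totals; the difference is how many rows cross the wire. The first transfers one row per sale, the second one row per country.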
Data Reservoir Use Case with Oracle Data Integration
Oracle Confidential – Internal/Restricted/Highly Restricted
[Figure: data reservoir architecture. Data stores and feeds: Logs, OLTP Databases, Social Media, Sensor Data, Data Warehouses, Datamarts. Products: Oracle Data Integrator, Oracle GoldenGate, Oracle Enterprise Metadata Management, Oracle Data Service Integrator, Oracle Enterprise Data Quality. Labeled flows and engines: File Load; Sqoop Initial Load; Sqoop Load; OLH / OSCH; Big Data SQL; CDC to HDFS, Hive, Flume, HBase; Federated Queries; Transformations with HDFS, Hive, HBase, Pig; Pig; Impala.]
Logical and Physical Design with ODI
[Figure: Logical Design panel (Oracle, MySQL, Hive) and Physical Design panel (Oracle, Hive, MySQL, Hive) with component labels Sqoop, Sqoop, IKM, LKM, LKM.]
Design Once, Run Anywhere
• Use native technologies for any data source
– Data Locality
– Optimal performance, reduced network traffic
• No proprietary middle tier
– Reduced infrastructure cost and maintenance effort
• Declarative design
– Simplified development
– Reusable across technologies
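As a rough illustration of "declarative design, reusable across technologies", the sketch below describes one mapping once and renders it for two targets. This is a toy, not how ODI knowledge modules work internally; the `Mapping` class, the renderer functions, and the table and column names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """One logical mapping: what to load, independent of where it runs."""
    target: str
    source: str
    columns: dict  # target column name -> source expression
    where: str = ""


def _where_clause(mapping):
    return f" WHERE {mapping.where}" if mapping.where else ""


def to_hive(mapping):
    """Render the mapping as a HiveQL INSERT ... SELECT statement."""
    exprs = ", ".join(mapping.columns.values())
    return (f"INSERT INTO TABLE {mapping.target} "
            f"SELECT {exprs} FROM {mapping.source}" + _where_clause(mapping))


def to_oracle(mapping):
    """Render the same mapping as an Oracle-style INSERT ... SELECT."""
    cols = ", ".join(mapping.columns)
    exprs = ", ".join(mapping.columns.values())
    return (f"INSERT INTO {mapping.target} ({cols}) "
            f"SELECT {exprs} FROM {mapping.source}" + _where_clause(mapping))


mapping = Mapping(
    target="country_sales",
    source="activity",
    columns={"country": "UPPER(country)", "sales": "amount"},
    where="amount > 0",
)
print(to_hive(mapping))
print(to_oracle(mapping))
```

The logical definition stays the same; only the renderer chosen for the physical target changes.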
[Figure: Agent with Languages and Tools and Runtime Environments. Labels: Hive, Sqoop, Big Data SQL, OLH, OSCH, Future Languages, Future Runtime Engines.]
Oracle GoldenGate Adapter – Big Data Use Cases
[Figure labels: Source Database; Log File; Capture; Capture Parameter File; Trail File; Pump; Pump Parameter file; Java Adapter; Adapter Property file; Adapter Jar file; HDFS file; Hive; HBase; Flume (Source, Channel, Sink); Other Custom Targets.]
Introduction
• Michael Rainey
• Principal Consultant - Rittman Mead
• Oracle Data Integration expert
– Oracle Data Integrator and Oracle GoldenGate
• Oracle ACE
• Twitter: @mRainey
About Rittman Mead
• Oracle Gold partner
– World leading specialist partner for technical excellence, solutions delivery and innovation in Oracle BI
– Provide consulting, training, managed services for customers worldwide
• 120+ consultants including 1 Oracle ACE Director, 3 Oracle ACEs and 1 Oracle ACE Associate
– All expert in Oracle BI, DW, EPM and Analytics tech
– Skills in broad range of supporting Oracle tools: OBIEE, OBIA, ODIEE, Essbase, Oracle OLAP, GoldenGate, Exadata, Endeca
• Blog: www.rittmanmead.com/blog
• Twitter: @rittmanmead
Customer Challenge
• Company has subscribers with in-home devices
• Company wishes to improve customer experience
• Log data can potentially help identify issues, but is difficult to access and read
• …and there’s a lot of data!
Big Data Solution
• 6 Node Big Data Appliance (BDA)
Extract data from XML logs via python script
Load data to HDFS using copyFromLocal command
Filter, format, sort data using Oracle R
Aggregate & transform data using python scripts & HiveQL
Load to Oracle DB via Sqoop
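The first two steps above (extract from XML logs with a Python script, then load to HDFS with copyFromLocal) might look roughly like the sketch below. The log layout, attribute names, file name, and HDFS directory are assumptions for illustration; the customer's actual log format is not shown in the slides. The sketch only builds the `hadoop fs -copyFromLocal` command and does not run it.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical log layout; the real device logs are not shown in the deck.
SAMPLE_LOG = """<log>
  <event device="A1" ts="2014-09-29T10:00:00" code="REBOOT"/>
  <event device="B7" ts="2014-09-29T10:05:00" code="SIGNAL_LOSS"/>
</log>"""


def xml_log_to_rows(xml_text):
    """Flatten each <event> element into a (device, timestamp, code) tuple."""
    root = ET.fromstring(xml_text)
    return [(e.get("device"), e.get("ts"), e.get("code"))
            for e in root.iter("event")]


def rows_to_delimited(rows):
    """Serialize rows as tab-delimited text, a format Hive can read directly."""
    buf = io.StringIO()
    csv.writer(buf, delimiter="\t", lineterminator="\n").writerows(rows)
    return buf.getvalue()


def copy_from_local_command(local_path, hdfs_dir):
    """Build (but do not execute) the HDFS load command as an argument list."""
    return ["hadoop", "fs", "-copyFromLocal", local_path, hdfs_dir]


rows = xml_log_to_rows(SAMPLE_LOG)
print(rows_to_delimited(rows), end="")
print(" ".join(copy_from_local_command("events.tsv", "/data/logs")))
```

On a machine with a Hadoop client installed, the returned argument list could be passed to `subprocess.run` after the delimited text has been written to the local file.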
Wait, this looks familiar…
• Looks like a standard data integration project!
• Scripts written to extract, load, and transform data
• Source data and transformations evolving
• But something is missing
– Scheduling, process flow, monitoring, data quality
– Standardization and maintainability
Transition to an ETL tool
• Initial thought…Informatica
– Client has experience with product
• Why Oracle Data Integrator?
– Extensibility - “Design Once…”
– No middle ETL engine
– Data Quality
• And…it’s licensed with their BDA!
Big Data Solution using ODI 12c
Extract data from XML logs via python script
Load data to HDFS using copyFromLocal command
Filter, format, sort data using Oracle R
Aggregate & transform data using python scripts & HiveQL
Load to Oracle DB via Sqoop
ODI components: ODI Procedure, IKM Hive Transform, IKM Hive Control Append, IKM File-Hive to SQL (SQOOP)
What we learned along the way…
• HiveQL <> Oracle SQL
– In the Hive KMs, check the Generate ANSI Syntax checkbox; Hive expects table joins in ANSI format rather than the “Oracle” format.
• Begin with scripts, but have ODI Application Adapters for Hadoop in mind
• Utilize the skills your available resources have
– Not everyone can write MapReduce code
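To make the join-syntax point concrete, the sketch below writes the same inner join in the comma-and-WHERE ("Oracle") style and in ANSI JOIN ... ON style. SQLite is used here only as a neutral engine that accepts both forms, so the two results can be compared; it stands in for neither Hive nor Oracle, and the `customer` and `activity` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (cust_id INTEGER, name TEXT);
CREATE TABLE activity (cust_id INTEGER, movie TEXT);
INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bo');
INSERT INTO activity VALUES (1, 'Up'), (1, 'Heat'), (2, 'Big');
""")

# "Oracle" style: tables listed with commas, join condition in WHERE.
ORACLE_STYLE = """
SELECT c.name, a.movie
FROM customer c, activity a
WHERE c.cust_id = a.cust_id
ORDER BY c.name, a.movie
"""

# ANSI style: explicit JOIN ... ON, the form the slide says Hive expects.
ANSI_STYLE = """
SELECT c.name, a.movie
FROM customer c
JOIN activity a ON c.cust_id = a.cust_id
ORDER BY c.name, a.movie
"""

oracle_rows = conn.execute(ORACLE_STYLE).fetchall()
ansi_rows = conn.execute(ANSI_STYLE).fetchall()
print(oracle_rows == ansi_rows)
```

The two statements are logically equivalent, which is why a single checkbox can switch the generated code from one form to the other without changing the mapping.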
Data Integration Demo
[Figure: demo data flow. Tools: Oracle Data Integrator, Oracle GoldenGate, Flume, SQOOP, OGG (HDFS/Flume). Sources: Application Logs, MySQL DB. Data sets: Activity, ActivityClean, CountrySales, Movie, MovieRating, Customer, Customer SessionStats. Steps: Process Activity (Hive), Calculate Rating (Hive), Sessionize Activity (Pig OS Call), Calc Purchases (Oracle), Load Oracle OLH/OSCH, Load Oracle Big Data SQL.]
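The "Sessionize Activity" step in the demo runs in Pig via an OS call; its script is not shown. As a stand-in, the sketch below shows the general idea of sessionization in Python: consecutive events from the same customer belong to one session until the gap between them exceeds a threshold. The 30-minute gap, the function name, and the event layout are assumptions, not details from the demo.

```python
from datetime import datetime, timedelta


def sessionize(events, gap=timedelta(minutes=30)):
    """Assign a per-customer session number to each (customer_id, timestamp).

    A new session starts when a customer's event arrives more than `gap`
    after that customer's previous event.
    """
    last_seen = {}   # customer_id -> timestamp of the previous event
    session_no = {}  # customer_id -> current session number
    result = []
    for cust, ts in sorted(events):
        prev = last_seen.get(cust)
        if prev is None or ts - prev > gap:
            session_no[cust] = session_no.get(cust, 0) + 1
        last_seen[cust] = ts
        result.append((cust, session_no[cust], ts))
    return result


events = [
    (1, datetime(2014, 9, 29, 10, 0)),
    (1, datetime(2014, 9, 29, 10, 10)),
    (1, datetime(2014, 9, 29, 11, 0)),
    (2, datetime(2014, 9, 29, 10, 5)),
]
for row in sessionize(events):
    print(row)
```

Per-session statistics such as event counts or durations can then be computed by grouping on the customer and session number.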
2014 Oracle Excellence Award Ceremony for Fusion Middleware Innovation
ORACLE FUSION MIDDLEWARE: CELEBRATE THIS YEAR'S MOST INNOVATIVE CUSTOMER SOLUTIONS
Tuesday, September 30, 2014, 5:00-5:45pm
YBCA Theater (next to Moscone North)
Session ID: CON7029
Resources
Oracle Data Integration
http://www.oracle.com/us/products/middleware/data-integration/overview/index.html
• Oracle Data Integrator
• Oracle GoldenGate
• Oracle Enterprise Data Quality
• Oracle Enterprise Metadata Management
• Oracle Data Services Integrator
Oracle Data Integration, OracleGoldenGate, ORCL DataIntegration, blogs.oracle.com/dataintegration
Questions and Answers
Oracle DIS Session @ OOW ’14 – Oracle GoldenGate
2:45PM – CON7717 Oracle GoldenGate New Features & Options Product Update
4:00PM – CON7716 Oracle GoldenGate 12c for Oracle Database 12c
5:15PM – CON7719 Enabling Real-Time Data Integration for Big Data
10:45AM – CON7715 Oracle Active Data Guard & Oracle GoldenGate for HA
12:00PM – CON7328 Near-Zero Downtime Unicode Migration for Oracle
12:00PM – CON774 Oracle GoldenGate for Cloud
6:00PM – BOF9597 International Oracle GoldenGate User Group Meeting
3:30PM – CON7934 Tapping into the Big Data Reserve with All Data
4:45PM – CON7922 Tame Big Data with Oracle Data Integration
4:45PM – CON7773 Oracle GoldenGate Performance Tuning for Oracle Database
10:45AM – CON7655 Achieving Zero Downtime During Oracle Application Upgrades & System Migrations
1:15PM – CON7718 Managing & Monitoring Oracle GoldenGate
MON TUE WED THU
Oracle DIS Session @ OOW ’14 – Oracle Data Integrator
4:00PM – CON7899 Oracle Data Integrator: Product Update and Future Strategy
5:00PM – CON7820 Making the Move from Oracle Warehouse Builder to Oracle Data Integrator
3:30PM – CON7934 Tapping into the Big Data Reserve with All Data
4:45PM – CON7922 Tame Big Data with Oracle Data Integration
9:30AM – CON7926 Oracle Data Integration: A Crucial Ingredient for Cloud Integration
10:45AM – CON7923 Oracle Data Integration & Metadata Management for Seamless Enterprise
2:30PM – CON7921 Insight into Action: Business Intelligence Applications and Oracle Data Integrator
MON TUE WED THU
Oracle DIS Session @ OOW ’14 – Enterprise Data Quality
11:45AM – CON7776 Data Quality Maturity Journey: Building Toward Strong Enterprise Data Quality
10:45AM – CON7780 Oracle Enterprise Data Quality: Product Overview and Roadmap
2:00PM – CON7775 The Essential Core of Data Governance with Oracle Enterprise Data Quality
3:30PM – CON7934 Tapping into the Big Data Reserve with All Data
4:45PM – CON7922 Tame Big Data with Oracle Data Integration
12:00PM – CON7931 Solving Big Data’s Big Problem with Data Preparation & Enrichment in the Cloud
MON TUE WED THU
Oracle DIS Hands-on Labs @ OOW ’14
Hotel Nikko, Nikko Ballroom II, 22 Mason Street
ODI
Tuesday 3:45PM – HOL9439
• Oracle Data Integrator 12c New Features Deep Dive
Tuesday 5:15PM – HOL9414
• Oracle Data Integrator for Big Data
OGG
Monday 1:15PM – HOL9437
• Oracle GoldenGate 12c New Features Deep Dive
Wednesday 4:15PM – HOL9436
• Pushing Transactions to JCache with Coherence and GoldenGate
Thursday 10AM – HOL9413
• Oracle GoldenGate Heterogeneous Replication
EDQ
Monday 2:45PM – HOL9438
• Oracle Enterprise Data Quality Introduction