Date post: 28-Dec-2015
Upload: dale-griffin

My experience building a custom ETL system

Problems, solutions and Oracle quirks

or How scary Oracle can look for a Java developer

Agenda

• WHY do we need an ETL?
• HOW it works
• Experience:
  – task
  – existing solutions?
  – problem or Oracle quirk
  – my solution

WHY do we need an ETL?

• OLTP - in most cases the calculation is non-trivial:
  – SQLs grow in size & complexity
  – increased maintenance effort
  – poor SQL performance

• business value calculations are implemented independently in most reports - no code reuse:
  – maintenance effort is multiplied by the number of reports
  – copy-paste-driven development

WHY do we need an ETL?

Balance / UPL calculation:

OLTP-like RC
• 46 lines
• 7 joins
• 3 levels

after RADIO
• 13 lines
• 3 tables
• 1 level

RADIO: Data flow

Streams
• Propagate changes to RC - replica tables

Triggers
• Collect transaction info
• Collect changes – change log tables

Processing routines
• Read changes
• Analyze transaction info
• Denormalize, calculate additional fields
• Insert into denormalized tables
• O!

Routines:
• FLAG – extract rows/entities for each event in transaction, sort
• MAIN – given event rows, run actions
• SNAP – when we have all TX from OLTP snap – calculate appropriate DN snap

RADIO: Data flow

[Diagram; labels: CLOG tables, FLAG job, TX info, MAIN job, DN tables, Action, CLDao, Dao, DNDao]

PLSQL code generator

>600kB of pl/sql code:
• TX element row → create type as object
• resulting DN row → create type as object
• action for each event type → create type as object

Code maintenance is a pain → use a higher-level language!

JPLSQL = java+pl/sql: jsp-like parser for producing pl/sql.
• XML-based DB structure
• XML-based flag/action mapping
• power of Java
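
The slides do not show JPLSQL's template syntax or its XML format, so the following is only a minimal sketch of the general idea: describe a row once in Java and generate the repetitive "create type ... as object" text from it. The class name `TypeGenerator`, the method `createObjectType` and the sample type and column names are all hypothetical, and a plain map stands in for the XML-based DB structure.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical sketch, not the real JPLSQL tool: that tool read XML and used a
// JSP-like template syntax, neither of which is reproduced here.
final class TypeGenerator {

    // Builds the DDL text for one object type.
    // Attributes are emitted in the iteration order of the map.
    static String createObjectType(String typeName, Map<String, String> columns) {
        StringJoiner attrs = new StringJoiner(",\n  ", "  ", "\n");
        for (Map.Entry<String, String> column : columns.entrySet()) {
            attrs.add(column.getKey() + " " + column.getValue());
        }
        return "create or replace type " + typeName + " as object (\n" + attrs + ")";
    }

    public static void main(String[] args) {
        // Made-up row description; a LinkedHashMap keeps the column order stable.
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("trade_id", "number");
        columns.put("balance", "number");
        columns.put("currency", "varchar2(3)");
        System.out.println(createObjectType("t_dn_trade_row", columns));
    }
}
```

The same description could drive the TX element row, the DN row and the action types, which is where a generator starts to pay off against hand-maintained PL/SQL.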

Streams, Triggers & CLOGs

• after trigger
• my equals
• duplicate scn

After trigger

• we keep TX apply state in package variables
• a before trigger is invoked
• SUDDENLY! the transaction is rolled back – package variables stay altered, because package state is not transactional!

Use only after triggers

My equals

We need to filter out changes that happened in columns we don’t collect. But what do we do with Oracle’s null?

nvl(:new.val = :old.val, :new.val is null and :old.val is null)

Simple inline in JPLSQL.
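
For readers coming from Java, the expression above can be modelled as follows. This is a sketch of the semantics only, not code from the project: `sqlEquals` imitates the three-valued "=" (unknown when either side is null), `nvl` imitates the boolean form of nvl, and `myEquals` combines them exactly as the trigger expression does. All three names are made up for illustration.

```java
import java.util.Objects;

// Illustrative model of the null-safe comparison; names are hypothetical.
final class NullSafeEquals {

    // SQL-style "=": the result is unknown (null here) when either operand is null.
    static Boolean sqlEquals(Object a, Object b) {
        if (a == null || b == null) {
            return null;
        }
        return a.equals(b);
    }

    // Boolean nvl: use the fallback when the first value is unknown.
    static boolean nvl(Boolean value, boolean fallback) {
        return value != null ? value : fallback;
    }

    // nvl(new = old, new is null and old is null)
    static boolean myEquals(Object newVal, Object oldVal) {
        return nvl(sqlEquals(newVal, oldVal), newVal == null && oldVal == null);
    }

    public static void main(String[] args) {
        Object[][] pairs = { {"a", "a"}, {"a", "b"}, {null, "a"}, {null, null} };
        for (Object[] p : pairs) {
            // myEquals gives the same answer as java.util.Objects.equals.
            System.out.println(p[0] + " vs " + p[1] + ": " + myEquals(p[0], p[1])
                    + " (Objects.equals: " + Objects.equals(p[0], p[1]) + ")");
        }
    }
}
```

In other words, the trigger expression gives Oracle the behaviour a Java developer already expects from `Objects.equals`: two nulls count as "unchanged", one null counts as "changed".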

Streams duplicate SCN

Ingredients:
• several sessions
• several tables
• Streams replication

The apply process can produce messages with the same SCN.
Oracle BUG ID: ???

Processing

• Job control
• FLAG → MAIN communication
• MAIN

Processing: JOB control

• identification – how do we know a job is running?
• communication – how do we communicate with a job?
  – dbms_alert has implicit commits
  – dbms_pipe is not compatible with RAC
• sleep – conditioned wait

Processing: FLAG

• 250 lines of SQL
• 300 lines of explain plan
• 1k TX per second

Processing: FLAG

FLAG computes:
– table of (event types)
  • table of (event occasions)
    – table of (table index for this event)
      » number (TX row id)

3-dimensional table problems:
• ordering (no order by in collect statement in 10g)
• storing – nested table doesn’t preserve ordering

[Diagram; labels: FLAG job, TX info, MAIN job]

Processing: MAIN

What is different from Java
• objects have default constructors – very useful for bulk creation
• encapsulation is bad - package method access is slower than variable access
• reading from a package variable is much, much faster than reading from tables

→ cache everything

Processing: MAIN

What is very different from Java
• object/record assignment works by value, not by reference
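
The by-value versus by-reference difference is easy to show from the Java side. In the sketch below, `Row` is a made-up stand-in for a PL/SQL object or record: plain Java assignment creates an alias, while the explicit `copy()` imitates what a PL/SQL assignment of an object or record does.

```java
// Illustrative only; Row and its fields are hypothetical.
final class AssignmentSemantics {

    // Minimal mutable row, standing in for a PL/SQL object or record.
    static final class Row {
        long id;
        String status;

        Row(long id, String status) {
            this.id = id;
            this.status = status;
        }

        // Field-by-field copy, imitating PL/SQL assignment by value.
        Row copy() {
            return new Row(id, status);
        }
    }

    public static void main(String[] args) {
        Row original = new Row(1, "NEW");

        // Java assignment: both variables refer to the same object.
        Row alias = original;
        alias.status = "DONE";
        System.out.println("after alias update: " + original.status);

        // PL/SQL-like assignment: an independent copy is created.
        Row copied = original.copy();
        copied.status = "FAILED";
        System.out.println("after copy update:  " + original.status);
    }
}
```

The practical consequence for a Java developer writing PL/SQL: updating a "local variable" taken from a collection changes only the copy, so the element has to be written back explicitly.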

Processing: MAIN

Java-like toString:
• get all object fields using user_source view
• execute immediate …
• very useful for debugging

Processing: MAIN

An exception to Tom Kyte’s “when others” rule:
• we really want to catch all kinds of errors:
  – infrastructure logic
  – business logic constraints
  – Oracle internal errors
• we really want to stop after any error

Post-processing: Deployer

Relieves system engineers from deployment pain
• Read installation bundle
• Read DB objects
• Compute difference
• Build patch

Each object type has its own:
• create / change statement syntax
• system view structure
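
The "compute difference" step might look roughly like this. The slides give no detail on how the Deployer represents objects, so the sketch assumes the simplest possible model: each side is a map from object name to its definition text, and the result says which objects need a create and which need a change. `ObjectDiff`, `Change` and `diff` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch; the real Deployer's data model is not shown in the slides.
final class ObjectDiff {

    enum Change { CREATE, CHANGE }

    // Compares the installation bundle with the current DB objects.
    // Objects that exist only in the database are ignored in this sketch.
    static Map<String, Change> diff(Map<String, String> bundle, Map<String, String> database) {
        Map<String, Change> patch = new TreeMap<>();
        for (Map.Entry<String, String> wanted : bundle.entrySet()) {
            String current = database.get(wanted.getKey());
            if (current == null) {
                patch.put(wanted.getKey(), Change.CREATE);
            } else if (!current.equals(wanted.getValue())) {
                patch.put(wanted.getKey(), Change.CHANGE);
            }
        }
        return patch;
    }

    public static void main(String[] args) {
        // Made-up object names and definitions.
        Map<String, String> bundle = new LinkedHashMap<>();
        bundle.put("dn_trade", "id number, balance number");
        bundle.put("t_trade_row", "id number");
        bundle.put("clog_trade", "id number, scn number");

        Map<String, String> database = new LinkedHashMap<>();
        database.put("dn_trade", "id number");
        database.put("t_trade_row", "id number");

        System.out.println(diff(bundle, database));
    }
}
```

A real implementation would compare per object type, since each type has its own statement syntax and system view; the sketch only shows the overall shape of the comparison.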

Post-processing: Deployer

Oracle has object dependencies:
• pl/sql depends on tables
• tables depend on user types
• user types depend on their parent types
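
A dependency chain like this means the patch has to create objects in dependency order. The slides do not say how the Deployer orders its statements; one common approach is a topological sort, sketched here with Kahn's algorithm. `DeployOrder`, `order` and the object names are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of ordering DB objects so dependencies come first.
final class DeployOrder {

    // dependsOn maps each object to the objects it depends on.
    // Returns an order in which every object comes after its dependencies.
    static List<String> order(Map<String, List<String>> dependsOn) {
        Map<String, Integer> pending = new LinkedHashMap<>();       // unmet dependency count
        Map<String, List<String>> dependents = new LinkedHashMap<>(); // reverse edges
        for (Map.Entry<String, List<String>> e : dependsOn.entrySet()) {
            pending.putIfAbsent(e.getKey(), 0);
            for (String dep : e.getValue()) {
                pending.putIfAbsent(dep, 0);
                pending.merge(e.getKey(), 1, Integer::sum);
                dependents.computeIfAbsent(dep, k -> new ArrayList<>()).add(e.getKey());
            }
        }

        // Start with objects that depend on nothing.
        Deque<String> ready = new ArrayDeque<>();
        pending.forEach((name, count) -> {
            if (count == 0) {
                ready.add(name);
            }
        });

        List<String> result = new ArrayList<>();
        while (!ready.isEmpty()) {
            String next = ready.remove();
            result.add(next);
            for (String dependent : dependents.getOrDefault(next, List.of())) {
                if (pending.merge(dependent, -1, Integer::sum) == 0) {
                    ready.add(dependent);
                }
            }
        }
        if (result.size() != pending.size()) {
            throw new IllegalStateException("dependency cycle detected");
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up objects mirroring the chain: pl/sql -> table -> type -> parent type.
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("pkg_main", List.of("dn_trade"));
        deps.put("dn_trade", List.of("t_trade_row"));
        deps.put("t_trade_row", List.of("t_base_row"));
        deps.put("t_base_row", List.of());
        System.out.println(order(deps));
    }
}
```

Dropping or replacing objects would walk the same graph in the opposite direction, dependents first.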

Misc

• SNAP – Trade/RC scn mapping
• datatest – xmlforest, emails
• very slow dbms_output retrieval

Questions?

