Do You Need an ETL Tool? Ben Bor, NZ Ministry of Health

Date post: 18-Jan-2018
1

Do You Need an ETL Tool?

Ben Bor
NZ Ministry of Health

2

Ben Bor

- Over 20 years in IT, most of it in Information Management
- Oracle specialist since version 5
- Involved in Business Intelligence for over 10 years
- Consulted for the world’s largest corporations
- Presents regularly on Information Management
- Was annual Guest Lecturer at Sussex University

3

Contents

- What is ETL
- ETL tools vs. ‘handcraft’ code
- PL/SQL techniques

4

What is ETL

ETL = Extract, Transform and Load:
- Any source, any target
- Built-in complex transformations

Point-to-point vs. hub-and-spoke

5

Traditional ETL

6

Our Own ETL Requirements

[Diagram. Labels: Flat Files, SQL Loader, PL/SQL, Data Quality]

7

Travel Company Example

[Diagram. Systems and data sources: Aurora, CTQ, 3rd Party Data, Oracle Financials, Calypso, Tropics, Australian Reservations, FileMaker, CRM, 3rd Party Marketing, Website, Employees, Supplier, Others ("No existing process").
Processing stages and outputs: Load Area, Oracle Staging Area, DQE, Business Group, Business Rules, Manual Data Effort, Cleansed Data, Audit Report, Progress Report.
Estimated Volumes: 150,000 Travel Agencies; 500 Groups; 50 Consortia; 500,000 Consultants; 3 million Bookings; 1 million Brochure Requests; 400,000 Questionnaires; 35,000 Supplier Types; two further figures (1 million, 250,000) whose labels are unclear.
Key: future system, existing system, feedback.]

8

Tools or Handcraft?

ETL Advantages:
- Graphical user interface
- Automatic documentation
- Off-the-shelf set of ready-to-use transformations
- Built-in scheduler
- Database agnostic

Handcrafting Advantages:
- No limitation
- Reuse existing code & non-ETL
- No specific methodology
- No license cost
- No impact on infrastructure
- Transportable
- Release & Code Management by script

9

Oracle ETL Facilities

- External Tables
- Merge
- SQL Loader
- PL/SQL
- Database links
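
As a rough illustration of how two of these facilities can work together, the sketch below reads a flat file through an external table and upserts it into a target table with MERGE. Every name in it (the LOAD_DIR directory object, agencies.csv, the AGENCY_EXT and AGENCY tables and their columns) is hypothetical and not taken from the presentation.

```sql
-- Hypothetical names throughout; LOAD_DIR must already exist as a directory object
-- and AGENCY must already exist with matching columns.
CREATE TABLE agency_ext (
    agency_id   NUMBER,
    agency_name VARCHAR2(100)
)
ORGANIZATION EXTERNAL (
    TYPE ORACLE_LOADER
    DEFAULT DIRECTORY load_dir
    ACCESS PARAMETERS (
        RECORDS DELIMITED BY NEWLINE
        FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
        MISSING FIELD VALUES ARE NULL
    )
    LOCATION ('agencies.csv')
)
REJECT LIMIT UNLIMITED;

-- Upsert: update agencies already in the target, insert the new ones.
MERGE INTO agency tgt
USING agency_ext src
ON (tgt.agency_id = src.agency_id)
WHEN MATCHED THEN
    UPDATE SET tgt.agency_name = src.agency_name
WHEN NOT MATCHED THEN
    INSERT (agency_id, agency_name)
    VALUES (src.agency_id, src.agency_name);
```

With this arrangement the file is queried in place, so no separate load step is needed before the MERGE.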

10

Why Use PL/SQL

- Integrated environment (no installation required)
- Available resources
- Reuse code ‘snippets’
- Good performance
- Integration with and control of the database

11

PL/SQL Tips and Techniques

1. Quality
2. Techniques
3. Tricks

12

Quality

13

What is Quality?

[1] “Totality of characteristics of an entity that bear on its ability to satisfy stated and implied needs.”

[The ISO 8402 definition of quality]

14

Quality 2

[2] Quality is a collection of “ilities”:
- Reliability - operate error free
- Modifiability - have enhancement changes made easily
- Understandability - understand the software readily
- Efficiency - the speed of the software
- Usability - use the software easily
- Testability - construct and execute test cases easily
- Portability - transport the software easily

15

Quality 3

[3] “All the things you do today in your software development, in order to bear fruit in the future.”

16

Standards & Conventions

Use meaningful names:
V_Number_Of_Items_In_Array vs. i or no_itms

Distinguish between types:
- V_ Variable
- a_ Parameter
- C_ Constant
- G_ Global constant
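
A small anonymous block can show the prefixes side by side. This is only a sketch: the procedure Add_Items and the constant C_Max_Items are made up for illustration, and a G_ global constant would normally live in a package specification rather than in a block like this.

```sql
DECLARE
    C_Max_Items                CONSTANT PLS_INTEGER := 100;  -- C_ : constant
    V_Number_Of_Items_In_Array PLS_INTEGER := 0;             -- V_ : variable

    -- a_ : parameter (Add_Items is a hypothetical helper)
    PROCEDURE Add_Items(a_Count IN PLS_INTEGER) IS
    BEGIN
        -- Never exceed the declared maximum.
        V_Number_Of_Items_In_Array :=
            LEAST(V_Number_Of_Items_In_Array + a_Count, C_Max_Items);
    END Add_Items;
BEGIN
    Add_Items(3);
    DBMS_OUTPUT.PUT_LINE('Items: ' || V_Number_Of_Items_In_Array);
END;
/
```

Reading the body of Add_Items, the prefix alone tells you which names are inputs, which are state and which can never change.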

17

Using Packages

Central package with utilities and all output:
- All error messages and numbers
- All common constants (date format, etc.)
- Global variables
- Statistics data

Other packages encapsulate related logic.

Within a package, procedures & functions have:
- Meaningful name
- A99_ prefix: A is the level (A highest), 99 is a unique ID

18

Example: procedure and variable naming

XXX_Write_Flat_File.U03_Write_Record_To_CSV(
    a_File_Handle,
    C_Field_Delim,
    C_Field_Separ,
    C_Record_Separ,
    RM_REFERENCE_rec.REFTYPE,
    RM_REFERENCE_rec.CODE,
    RM_REFERENCE_rec.DESCRIPTION,
    To_Char(RM_REFERENCE_rec.ISDEFAULT , '9')) ;

19

Techniques

- Error logging
- Autonomous Transaction
- Run statistics
- Release mechanism
- Overloading

20

Error Logging Technique

Global variables keep key information:
- Record ID
- Run ID
- Location in code

Local error trapping decides severity and error code.

All error trapping passed up.
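
One possible reading of this technique is sketched below. The local handler knows the error number and severity, calls the central logging routine, then re-raises so the caller decides what to do with the record. The logging procedure here is a stand-in that prints instead of inserting, and the lookup logic, the 'STANDARD' value and the 'A1' plan code are invented for the example.

```sql
DECLARE
    G_Place_In_Code VARCHAR2(64);   -- location in code (a package global in practice)
    G_Source_URN    VARCHAR2(20);   -- ID of the record being processed

    -- Stand-in for the central logging procedure: prints rather than inserts.
    PROCEDURE E00_Write_Error_Log(
        a_Err_Num         IN INTEGER,
        a_Severity        IN INTEGER,
        a_Err_Location    IN VARCHAR2,
        a_Err_Description IN VARCHAR2) IS
    BEGIN
        DBMS_OUTPUT.PUT_LINE(G_Source_URN || ' ' || G_Place_In_Code || ' ' ||
            a_Err_Num || ' sev=' || a_Severity || ' ' ||
            a_Err_Location || ': ' || a_Err_Description);
    END E00_Write_Error_Log;

    -- Hypothetical lookup: only plan code 'A1' is known.
    FUNCTION A08_Lookup_Type(a_Plan_Code IN VARCHAR2) RETURN VARCHAR2 IS
        V_Type VARCHAR2(30);
    BEGIN
        G_Place_In_Code := 'A08';
        SELECT 'STANDARD' INTO V_Type FROM dual WHERE a_Plan_Code = 'A1';
        RETURN V_Type;
    EXCEPTION
        WHEN NO_DATA_FOUND THEN
            -- Local trap: this code decides the error number and severity.
            E00_Write_Error_Log(1001, 10, 'A08_Lookup_Type',
                'No match found for [Plan_Code] value [' || a_Plan_Code || ']');
            RAISE;  -- pass the error up to the caller
    END A08_Lookup_Type;
BEGIN
    G_Source_URN := '223010913';
    DBMS_OUTPUT.PUT_LINE(A08_Lookup_Type('C3'));
EXCEPTION
    WHEN NO_DATA_FOUND THEN
        -- The top level decides what happens to the failed record.
        DBMS_OUTPUT.PUT_LINE('Record skipped');
END;
/
```

Because the record ID and code location sit in globals, the handler only has to supply what it alone knows: the error number, severity and description.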

21

Error Logging Structure

TABLE ERROR_LOG(
    ERR_TIME          DATE,
    ERR_NUM           INTEGER,
    SOURCE_URN        VARCHAR2(20),
    SOURCE_SYSTEM_ID  VARCHAR2(5),
    PLACE_IN_CODE     VARCHAR2(64),
    ERR_LOCATION      VARCHAR2(255),
    ERR_DESCRIPTION   VARCHAR2(512),
    SEVERITY          NUMBER(6) )

Example record:

ERR_TIME          18-OCT-02 10:04:52
ERR_NUM           1001
SOURCE_URN        223010913
SOURCE_SYSTEM_ID  CRS
PLACE_IN_CODE     In FLIP_PKG B06 ; 6(utils A08)
ERR_LOCATION      A08_Lookup_Type
ERR_DESCRIPTION   No match found for [Plan_Code] value [C3]
SEVERITY          10

22

Autonomous Transaction

-- ===================
PROCEDURE E00_write_error_log(
-- ===================
    a_err_num         IN Integer ,
    a_Severity        IN Integer ,
    a_err_location    IN VarChar ,
    a_err_description IN VarChar )
IS
    PRAGMA AUTONOMOUS_TRANSACTION;
    V_Place_In_Code DW_Process.Error_Log.Place_In_Code%Type;
BEGIN
    V_Place_In_Code := G_Place_In_Code || '(utils ' || G_Place_In_UTILS_Code || ')' ;
    INSERT INTO DW_Process.Error_Log
        (err_time, err_num, Severity,
         BOROUGH_ID, SOURCE_URN, SOURCE_SYSTEM_ID,
         Place_In_Code, err_location, err_description)
    VALUES
        (sysdate, a_err_num, a_Severity,
         G_BOROUGH_ID, G_SOURCE_URN, G_SOURCE_SYSTEM_ID,
         V_Place_In_Code, a_err_location, a_err_description) ;

    COMMIT ; -- commit the autonomous transaction, outside transaction is unaffected.
    G_Stats_Rec.TOTAL_NO_OF_ERRORS := G_Stats_Rec.TOTAL_NO_OF_ERRORS + 1 ;
-- ===================
END E00_Write_Error_Log ;
-- ===================

23

Run Statistics

- G_Stats_Rec is a record with all the statistics fields
- Defined in the central package (therefore resident in memory)
- It is updated by the writing procedures (all central)
- It is written out at the end of the run
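
A minimal sketch of such a record, assuming a central package called Central_Pkg and a Run_Stats table, both hypothetical. Only the names G_Stats_Rec and TOTAL_NO_OF_ERRORS come from the slides; the other fields and the procedure Z01_Write_Run_Stats are illustrative.

```sql
-- Hypothetical table that receives one row per run.
CREATE TABLE Run_Stats (
    run_id        INTEGER,
    start_time    DATE,
    end_time      DATE,
    total_records INTEGER,
    total_errors  INTEGER
);

CREATE OR REPLACE PACKAGE Central_Pkg IS
    TYPE Stats_Rec_Type IS RECORD (
        RUN_ID               INTEGER,
        START_TIME           DATE,
        TOTAL_NO_OF_RECORDS  INTEGER := 0,
        TOTAL_NO_OF_ERRORS   INTEGER := 0
    );
    -- Package-level variable: keeps its value for the whole session.
    G_Stats_Rec Stats_Rec_Type;

    PROCEDURE Z01_Write_Run_Stats;
END Central_Pkg;
/
CREATE OR REPLACE PACKAGE BODY Central_Pkg IS
    -- Called once at the end of the run to persist the counters.
    PROCEDURE Z01_Write_Run_Stats IS
    BEGIN
        INSERT INTO Run_Stats
            (run_id, start_time, end_time, total_records, total_errors)
        VALUES
            (G_Stats_Rec.RUN_ID, G_Stats_Rec.START_TIME, SYSDATE,
             G_Stats_Rec.TOTAL_NO_OF_RECORDS, G_Stats_Rec.TOTAL_NO_OF_ERRORS);
        COMMIT;
    END Z01_Write_Run_Stats;
END Central_Pkg;
/
```

Counting in memory and writing once keeps the per-record overhead to a simple assignment.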

24

Release Mechanism

- Table of ‘release notes’
- Each package has C_Version constant updated each release
- ‘Show_Version’ scripts display versions and notes
- Results shipped with each release
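
The mechanism could look roughly like the SQL*Plus script below. The Release_Notes columns, the package name Demo_Pkg and the version string are assumptions made for the sketch; only C_Version and the idea of a Show_Version script come from the slide.

```sql
-- Hypothetical release-notes table.
CREATE TABLE Release_Notes (
    package_name VARCHAR2(30),
    version      VARCHAR2(10),
    release_date DATE,
    note         VARCHAR2(512)
);

-- Each package exposes its own version constant.
CREATE OR REPLACE PACKAGE Demo_Pkg IS
    C_Version CONSTANT VARCHAR2(10) := '1.4.0';
END Demo_Pkg;
/

-- Show_Version script: print the compiled version, then list the notes.
SET SERVEROUTPUT ON
BEGIN
    DBMS_OUTPUT.PUT_LINE('Demo_Pkg version: ' || Demo_Pkg.C_Version);
END;
/
SELECT package_name, version, release_date, note
FROM   Release_Notes
ORDER  BY package_name, release_date;
```

Comparing the printed constant with the latest row in Release_Notes shows whether the code actually deployed matches the release that was documented.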

25

Remove Spaces

-- ===================
FUNCTION A04_Remove_Spaces(
-- ===================
    a_Instring IN Varchar )
Return Varchar
IS
/*
** Removes all the spaces from a string, leaving the rest of the printable characters
*/
BEGIN
    G_place_in_UTILS_code := 'A04' ; -- For use by the error trapping routine

    RETURN TRANSLATE( a_Instring,
        'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890' ||
        '\|,<.>/?#~@;:[{]}=+-_`¬!"£$%^&*() ',
        'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890' ||
        '\|,<.>/?#~@;:[{]}=+-_`¬!"£$%^&*()' ) ;
-- ===================
END A04_Remove_Spaces ;
-- ===================
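
The trick relies on how TRANSLATE treats its two lists: a character in the first list with no counterpart in the second is deleted, and characters in neither list pass through unchanged. The query below is a reduced illustration of that rule, not the author's code; the single placeholder character 'x' is an assumption standing in for the long character list.

```sql
-- 'x' maps to itself; the space has no counterpart, so it is removed.
-- The second list must not be empty, otherwise TRANSLATE returns NULL.
SELECT TRANSLATE('a b c', 'x ', 'x') AS via_translate,
       REPLACE('a b c', ' ')         AS via_replace
FROM   dual;
-- Both columns return 'abc'.
```

The same pattern extends to removing several unwanted characters in one call by appending them after the placeholder in the first list.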

26

Strip Leading non-numerics

-- ============================
FUNCTION F09_Strip_Leading_non_digits(
-- ============================
    a_String IN VARCHAR2 )
RETURN VARCHAR2
IS
/*
** Remove leading non-digits from the input.
** Example: Input string:  'abcde12345edcba'
**          Output string: '12345edcba'
*/
    v_string          Varchar2(4000) ;
    v_first_digit_pos Integer ;
BEGIN
    -- Replace all digits by 0
    v_string := Translate(a_String, '1234567890' , '0000000000') ;
    v_first_digit_pos := instr(v_string,'0') ;
    RETURN F01_Right(a_String, v_first_digit_pos ) ;
-- ============================
END F09_Strip_Leading_non_digits;
-- ============================
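
F01_Right is not shown in the slides. The standalone sketch below assumes it returns the input from the given position onwards and uses SUBSTR in its place. The function name and the choice to return NULL when the input contains no digit are also assumptions.

```sql
CREATE OR REPLACE FUNCTION Strip_Leading_Non_Digits(a_String IN VARCHAR2)
RETURN VARCHAR2
IS
    v_first_digit_pos INTEGER;
BEGIN
    -- Map every digit to '0', then find the first '0'.
    v_first_digit_pos :=
        INSTR(TRANSLATE(a_String, '1234567890', '0000000000'), '0');

    IF v_first_digit_pos = 0 THEN
        RETURN NULL;  -- assumption: no digit found means nothing is kept
    END IF;

    -- Keep everything from the first digit to the end of the string.
    RETURN SUBSTR(a_String, v_first_digit_pos);
END Strip_Leading_Non_Digits;
/

SELECT Strip_Leading_Non_Digits('abcde12345edcba') AS result FROM dual;
-- returns '12345edcba'
```

Mapping all ten digits to one character turns "find the first of ten characters" into a single INSTR call.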

27

Overloading

-- =======================
PROCEDURE U03_Write_Record_To_CSV(
-- =======================
    a_File_Handle  IN utl_file.file_type ,
    a_Field_Delim  IN VarChar , -- the quotes, for CSV
    a_Field_Separ  IN VarChar , -- the comma , for CSV
    a_Record_Separ IN VarChar , -- the Carriage Return + Line feed , for CSV
    a_String1      IN VarChar := G_default_Value ,
    a_String2      IN VarChar := G_default_Value ,
    a_String3      IN VarChar := G_default_Value ,
    ...)
IS
BEGIN
    IF a_String1 = G_default_Value THEN GOTO End_Of_Record ; END IF ;
    U02_Write(a_File_Handle, a_Field_Delim || a_String1 || a_Field_Delim) ;

    IF a_String2 = G_default_Value THEN GOTO End_Of_Record ; END IF ;
    U02_Write(a_File_Handle, a_Field_Separ || a_Field_Delim || a_String2 || a_Field_Delim ) ;

    IF a_String3 = G_default_Value THEN GOTO End_Of_Record ; END IF ;
    U02_Write(a_File_Handle, a_Field_Separ || a_Field_Delim || a_String3 || a_Field_Delim ) ;
    ...

    <<End_Of_Record>>
    U01_Write_Line(a_File_Handle, a_Record_Separ) ;
-- =======================
END U03_Write_Record_To_CSV ;
-- =======================
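
The procedure above accepts a variable number of fields through default parameter values. For comparison, the sketch below shows type-based overloading, where several procedures in a package share one name and differ in parameter type. The package Write_Demo, the procedure Write_Field and the date format are hypothetical and not from the presentation.

```sql
CREATE OR REPLACE PACKAGE Write_Demo IS
    -- Same name, different parameter types: the compiler picks the match.
    PROCEDURE Write_Field(a_Value IN VARCHAR2);
    PROCEDURE Write_Field(a_Value IN NUMBER);
    PROCEDURE Write_Field(a_Value IN DATE);
END Write_Demo;
/
CREATE OR REPLACE PACKAGE BODY Write_Demo IS
    PROCEDURE Write_Field(a_Value IN VARCHAR2) IS
    BEGIN
        DBMS_OUTPUT.PUT_LINE('"' || a_Value || '"');
    END Write_Field;

    -- Numbers and dates are converted, then handed to the VARCHAR2 version.
    PROCEDURE Write_Field(a_Value IN NUMBER) IS
    BEGIN
        Write_Field(TO_CHAR(a_Value));
    END Write_Field;

    PROCEDURE Write_Field(a_Value IN DATE) IS
    BEGIN
        Write_Field(TO_CHAR(a_Value, 'YYYY-MM-DD'));
    END Write_Field;
END Write_Demo;
/
BEGIN
    Write_Demo.Write_Field('abc');
    Write_Demo.Write_Field(42);
    Write_Demo.Write_Field(DATE '2002-10-18');
END;
/
```

Callers then pass values in their native type and the formatting rules stay in one place.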

28

Summary

ETL or PL/SQL? Your choice. Consider:
- Overall cost
- ‘Politics’
- Convenience
- Portability
- Speed of development
- Reusability

If PL/SQL: ensure Quality

29

Thank you !

31

Thank you !

I can be contacted at [email protected]

