KENT GRAZIANO | @KentGraziano | kentgraziano.com
AGILE DATA WAREHOUSING: USING ORACLE DATA MODELER (SDDM) TO BUILD A VIRTUALIZED ODS
Agenda (#VirtualODS)
© Data Warrior LLC
1. Bio
2. Architecture and Approach
   › What is a Virtualized ODS?
3. Using SDDM for pattern-based stage tables
4. Using views to load the stage tables
   › Building the views in SDDM
   › Using MD5 columns for Change Data Capture
5. Building ODS views in SDDM
   › Using Analytic Functions in views
6. Generating the DDL
   › SQL Server
   › Oracle
My Bio
› Senior Technical Evangelist, Snowflake Computing
› Oracle ACE Director (BI/DW)
› Certified Data Vault Master and DV 2.0 Practitioner
› Data Modeling, Data Architecture and Data Warehouse Specialist
› 30+ years in IT
› 25+ years of Oracle-related work
› 20+ years of data warehousing experience
› Former Member: Boulder BI Brain Trust (http://www.boulderbibraintrust.org/)
› Author & Co-Author of a bunch of books
› Blogger: The Data Warrior
› Past-President of Oracle Development Tools User Group and Rocky Mountain Oracle User Group
Snowflake Computing is…
› …a Silicon Valley innovator
› …built a new SQL data warehouse in the cloud
› …with broad customer adoption
The Snowflake Elastic Data Warehouse is…
› …All-new, SQL compliant (no legacy code)
› …Designed for the elastic cloud
› …Delivered as a service (nothing to manage)
Shameless Plug
Available on Amazon.com
http://www.amazon.com/Better-Data-Modeling-Enhancing-Developer-ebook/dp/B00UK75LYI/
Shameless Plug #2: Also On Amazon.com
NOW IN SPANISH TOO!
http://www.amazon.com/Check-Doing-Design-Reviews-ebook/dp/B008RG9L5E/
http://www.amazon.com/VERIFICACI%C3%93N-REALIZAR-REVISIONES-DISE%C3%91OS-MODELOS-ebook/dp/B00NUS1GFM/
Architecture & Approach
Goals
› New reporting environment
› Agile (i.e., quick delivery)
› Future Proof
Determination
› Use Data Vault 2.0
› Implement in Phases
Data Vault Definition
The Data Vault is a detail oriented, historical tracking and uniquely linked set of normalized tables that support one or more functional areas of business.
It is a hybrid approach encompassing the best of breed between 3rd normal form (3NF) and star schema. The design is flexible, scalable, consistent and adaptable to the needs of the enterprise.
Architected specifically to meet the needs of today’s enterprise data warehouses
DAN LINSTEDT: Defining the Data Vault (TDAN.com article)
Data Vault Components
Copyright 2011 Dan Linstedt, used by permission
Phase 1: Operational BI
Goals:
1. Support immediate business needs for operational reports
2. Provide an architectural component (stage layer) that supports the long-term data warehouse (DW) framework
3. Can be easily enhanced to accommodate information needs of other departments
4. Foundation for eliciting solid analytic BI requirements
[Diagram labels: XLink (data source), eRMS (data source), DW Stage Layer, Virtual Operational Data Store (ODS), BOBJ Operational Universe(s), BOBJ Operational Reports]
Phase 1: Operational BI
Data Warehouse (DW) Stage Layer
› Based on source system structures
› May simply be replicated source tables
› Refreshed several times a day
› Perform change data capture in this layer to provide persistent, historical data for future reporting needs
Virtual Operational Data Store (ODS)
› Abstraction layer between source and report tool
› Views on stage layer initially
› Provides proper modeling for building the Operational Universe(s) for BI report tool
› Includes Business Names and Joins
Phase 2: Analytic BI
Goals:
1. Provide foundation for long-term analytics platform (single source of information)
2. Create purpose-built Universe for analytic needs
3. Enable managed self-service BI by making it simpler for users to find the reports they need
[Diagram labels: XLink (data source), eRMS (data source), DW Stage Layer, Virtual ODS, BOBJ Operational Universe(s), BOBJ Operational Reports, Data Vault (Enterprise DW), Virtual Data Marts, BOBJ Analytics Universe(s), BOBJ Analytical Reports & Dashboards]
Phase 2: Analytic BI
Data Vault
› Provides one consistent source of information for both operational and analytic information
› Source system agnostic structures
› Easier to adapt and extend in future than 3NF or star schema
› Can be easily expanded as new data is added to the data warehouse foundation layer
› Persistent, historical capture of transaction-level data
› Allows meeting future unknown needs, as they arise
Addition of Data Vault should be transparent to BOBJ operational report users
› Modification to physical references in the universe hides the change from the users; the Operational universe still looks like “modified” source system structures
› Therefore, no rework of existing reports
Phase 2: Analytic BI
› Virtual data marts also sourced from data vault
› Marts provide an abstraction layer between DW and Business Objects
  › Can be easily expanded as new data is added to the Data Vault
› Easy to create new data marts for future business needs
Analytics universe(s) sourced from new virtual data marts
› Looks like proper star schema with facts and dimensions
  › Re-organizes the data to more effectively support business reporting
› Enables long-term universe support by most common BOBJ development skill set
› Can be converted to physical data mart (if needed)
  › For performance in a future release
  › For highly complex business rules
Building Pattern-Based Stage Tables
Create Table Template
› Include reusable metadata columns
Reverse Engineer Source Table(s)
› Copy and rename
Apply Template
› Use built-in transformation script
› Alternative
  › Copy template table
  › Merge with copy of source
Re-order columns as needed
Create Base Stage Table
01. Copy source table
02. Rename (add _stg)
03. Remove source indexes
04. Change schema assignment
05. Add or change table comment
06. Assign Stage classification (if you have one)
07. NOTE: You could script all this!
Apply Table Template Transform
Use Table Template and Transformation Script
Tools -> Design Rules -> Custom Transformations
Look for “table template” delivered script
› No change needed
Create table called table_template (or change script)
› With required columns and properties to be copied
Select “Apply”
› Changes all tables in design
Note: can script all sorts of stuff
› Check /datamodeler/xmlmetadata/doc
Use the Merge Tool
Alternate - Merging Tables
Adding Standard Columns
› 5th button on tool bar
› Good for building denormalized reporting tables
› Also for one-offs to add standard columns
Combines Two Tables
› Click merge button, then template, then target
› Edit result as needed
a. Copy template table
b. Merge with table needing the columns
Finalize Stage Table Design
01. Re-order columns
    › PRIM_KEY column is 1st
02. Add new PK constraint using PRIM_KEY column
03. Drop source PK constraint
    › Replace with Unique constraint
Final DW Stage Table
Source table name + stg suffix
New calculated PK for each stage record
Indicator of original source system PK
Additional meta-data columns to support change capture, load time and source
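The finished stage table pattern can be sketched as a small generator. This is only an illustration in Python, not SDDM's own transformation script: the function name `stage_table_ddl`, the data types, and the placement of the metadata columns ahead of the source columns are all assumptions; the column names PRIM_KEY, HASH_KEY, HASH_DIFF, LOAD_DTS and REC_SRC come from the slides.

```python
# Metadata columns named in the slides; the data types here are assumed.
META_COLUMNS = [
    ("PRIM_KEY", "CHAR(32) NOT NULL"),
    ("HASH_KEY", "CHAR(32) NOT NULL"),
    ("HASH_DIFF", "CHAR(32) NOT NULL"),
    ("LOAD_DTS", "DATETIME NOT NULL"),
    ("REC_SRC", "VARCHAR(20) NOT NULL"),
]


def stage_table_ddl(source_table, source_columns, schema="dw_stage"):
    """Return CREATE TABLE text for a stage table built from a source table."""
    stage_name = f"{source_table}_stg"  # source table name + stg suffix
    # PRIM_KEY is listed first; the source columns follow the metadata columns.
    columns = META_COLUMNS + list(source_columns)
    body = [f"    {name} {dtype}" for name, dtype in columns]
    # The new calculated key becomes the primary key of the stage table.
    body.append(f"    CONSTRAINT {stage_name}_pk PRIMARY KEY (PRIM_KEY)")
    return f"CREATE TABLE {schema}.{stage_name} (\n" + ",\n".join(body) + "\n);"


ddl = stage_table_ddl(
    "rmcodp",
    [("CODCODTYP", "CHAR(2)"), ("CODCODNUM", "CHAR(10)"), ("CODKEYNUM", "INT")],
)
print(ddl)
```

The source column types in the call are placeholders; in practice they would come from the reverse-engineered source table.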
Build Stage Load Views
For db to db ELT type loading
Includes code for Type 2 SCD style CDC
Use SDDM View Builder
› Select from source table (all columns)
› Drag and drop
› Alternate – Table to View wizard
› Add code from view template
Show code in DDL Preview
Test in SQL Developer
› Fix
› Repeat
Table To View Wizard
Pick Tables to use
Auto create new subview diagram
Auto add PK & FK to views based on base table
View Builder
Pick Syntax
Pick Tables & Columns
Add Calcs & Aliases & Filters
Add Complex Sub queries if needed
MD5 Keys & Columns
Concatenate source data fields and hash to create MD5 keys & columns
MD5 Key Types
PRIM_KEY:
› All source fields (in table order) + LOAD_DTS
› Uniquely IDs all records within the DW
› Can serve as an SCD-2 key in virtual Dims / Facts
HASH_KEY:
› Source field(s) (in table order) used by the system of record (SOR) to ID data rows uniquely for change data capture purposes
HASH_DIFF:
› All non-CDC_KEY source fields, i.e. those not used in HASH_KEY (in table order), to track deltas for change data capture purposes
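As a rough illustration of the three key types, the Python sketch below computes them for one row. The helper names `md5_hex` and `stage_keys`, the sample row, and the trim/uppercase normalization are assumptions made for the example; the exact string layout differs from the SQL templates in this deck, so the hash values will not match those produced in the database.

```python
import hashlib

SEP = "^"  # separator character used in the slides' concatenation example


def md5_hex(values):
    """Join values with a separator and return an uppercase 32-char MD5 string."""
    text = SEP.join("" if v is None else str(v).strip().upper() for v in values)
    return hashlib.md5(text.encode("utf-8")).hexdigest().upper()


def stage_keys(row, key_fields, load_dts):
    """Compute PRIM_KEY, HASH_KEY and HASH_DIFF for one source row.

    row is a dict whose insertion order stands in for table column order.
    """
    all_values = list(row.values())
    key_values = [v for f, v in row.items() if f in key_fields]
    diff_values = [v for f, v in row.items() if f not in key_fields]
    return {
        "PRIM_KEY": md5_hex(all_values + [load_dts]),  # all fields + LOAD_DTS
        "HASH_KEY": md5_hex(key_values),               # source key fields only
        "HASH_DIFF": md5_hex(diff_values),             # remaining fields
    }


keys = {"CODCODTYP", "CODCODNUM"}
row = {"CODCODTYP": "BG", "CODCODNUM": "100", "CODKEYNUM": 7, "CODLNGDES": "Retail"}
v1 = stage_keys(row, keys, "2016-01-01 08:00:00")
v2 = stage_keys(dict(row, CODLNGDES="Wholesale"), keys, "2016-01-02 08:00:00")
print(v1["HASH_KEY"] == v2["HASH_KEY"])    # same source key
print(v1["HASH_DIFF"] == v2["HASH_DIFF"])  # description changed
```

The two versions of the row share a HASH_KEY but differ in HASH_DIFF and PRIM_KEY, which is what lets the stage table hold both as history.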
MD5-Based Change Detection
Think Type 2 SCD (Slowly Changing Dimensions)
Old Way:
› Compare column by column
› Source value != Current value in DW table
› 20 columns, then 20 compares
New Way:
› Concatenate all columns to one string
› Convert to one char(32) string with hash function
› Compare to hashed value (HASH_DIFF) in target table
› Does not matter how many columns
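The "new way" can be sketched in a few lines of Python. The function `detect_change`, the dictionary standing in for the target table, and the 20-column sample row are hypothetical; the point is only that one hash comparison replaces one comparison per column.

```python
import hashlib


def md5_hex(values, sep="^"):
    """Concatenate values with a separator and hash to a 32-char string."""
    text = sep.join("" if v is None else str(v) for v in values)
    return hashlib.md5(text.encode("utf-8")).hexdigest().upper()


def detect_change(source_row, key_fields, current_hash_diffs):
    """Classify a source row as 'new', 'changed' or 'unchanged'.

    current_hash_diffs maps HASH_KEY -> HASH_DIFF of the current target row.
    """
    hash_key = md5_hex(v for f, v in source_row.items() if f in key_fields)
    hash_diff = md5_hex(v for f, v in source_row.items() if f not in key_fields)
    if hash_key not in current_hash_diffs:
        return "new"
    return "unchanged" if current_hash_diffs[hash_key] == hash_diff else "changed"


key_fields = {"ID"}
stored = {"ID": 1, **{f"COL{i}": i for i in range(1, 21)}}  # 20 data columns
target = {md5_hex([1]): md5_hex(stored[f"COL{i}"] for i in range(1, 21))}

same = dict(stored)
edited = dict(stored, COL20=99)
brand_new = dict(stored, ID=2)
results = [detect_change(r, key_fields, target) for r in (same, edited, brand_new)]
print(results)  # ['unchanged', 'changed', 'new']
```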
What Does It Look Like?
Encode using standard MD5 hash function (Oracle)
› rawtohex(sys.utl_raw.cast_to_raw(dbms_obfuscation_toolkit.md5(input_string => ...)))
Need to minimize chance of duplicates
› 12||3||45 and 1||2||345 hash to same value
› Need a separator between each
› Also handles case of null values
› Example: Col1||'^'||Col2||'^'||Col3
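The duplicate problem is easy to reproduce outside the database. The short Python check below uses the values from the slide plus a made-up null case; `md5_concat` is a hypothetical helper and nulls are treated as empty strings, which is an assumption about how the load view would handle them.

```python
import hashlib


def md5_concat(values, sep=""):
    """Concatenate values (None becomes an empty string) and return the MD5 hex."""
    text = sep.join("" if v is None else str(v) for v in values)
    return hashlib.md5(text.encode("utf-8")).hexdigest().upper()


row_a = ["12", "3", "45"]
row_b = ["1", "2", "345"]
no_sep_collides = md5_concat(row_a) == md5_concat(row_b)          # both are "12345"
sep_collides = md5_concat(row_a, "^") == md5_concat(row_b, "^")   # "12^3^45" vs "1^2^345"

null_a = ["A", None, "B"]
null_b = ["A", "B", None]
null_no_sep_collides = md5_concat(null_a) == md5_concat(null_b)          # both are "AB"
null_sep_collides = md5_concat(null_a, "^") == md5_concat(null_b, "^")   # "A^^B" vs "A^B^"

print(no_sep_collides, sep_collides)            # True False
print(null_no_sep_collides, null_sep_collides)  # True False
```

The separator should be a character that does not occur in the data itself, otherwise the same ambiguity can reappear.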
Other Considerations
To generate most consistent string: standardize!
Convert data types
› If 'NUMBER', 'NVARCHAR2', 'NVARCHAR', 'NCHAR' THEN 'TO_CHAR(' || column_name || ')'
› If 'RAW' THEN 'ENC_BASE64(' || column_name || ')'
› If 'DATE' THEN 'TO_CHAR(' || column_name || ', ''YYYY-MM-DD'')'
› If LIKE 'TIME%' THEN 'TO_CHAR(' || column_name || ', ''YYYY-MM-DD HH24:MI:SS'')'
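These rules lend themselves to generating the hash input expression from column metadata. The Python sketch below mirrors the four rules; the function names and sample columns are hypothetical, and ENC_BASE64 is reproduced from the slide as written (it is assumed to be a function available in the author's environment).

```python
def standardize_expr(column_name, data_type):
    """Return the SQL expression that converts a column to a consistent string."""
    data_type = data_type.upper()
    if data_type in ("NUMBER", "NVARCHAR2", "NVARCHAR", "NCHAR"):
        return f"TO_CHAR({column_name})"
    if data_type == "RAW":
        return f"ENC_BASE64({column_name})"  # name taken from the slide
    if data_type == "DATE":
        return f"TO_CHAR({column_name}, 'YYYY-MM-DD')"
    if data_type.startswith("TIME"):  # TIMESTAMP and its variants
        return f"TO_CHAR({column_name}, 'YYYY-MM-DD HH24:MI:SS')"
    return column_name  # other types are passed through unchanged


def hash_input_expr(columns, sep="^"):
    """Build the separator-delimited concatenation for a list of (name, type)."""
    return f"||'{sep}'||".join(standardize_expr(n, t) for n, t in columns)


expr = hash_input_expr(
    [("CODCODTYP", "VARCHAR2"), ("CODKEYNUM", "NUMBER"), ("CODCHGTIM", "TIMESTAMP(6)")]
)
print(expr)
```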
Template View Code – SQL Server
-- SQL Server load view template columns
PRIM_KEY,               -- placeholder for PK column
HASH_KEY,               -- placeholder for HASH key
HASH_DIFF,              -- placeholder for CDC column
GETDATE() AS LOAD_DTS,  -- current date and time
'eRMS' AS REC_SRC       -- a source system name

-- Template WHERE
WHERE -- supports loading new keys and changes, no dups
NOT EXISTS (
  SELECT 1
  FROM dw_stage.rmcodp_stg stg
  WHERE stg.HASH_KEY = upper(CONVERT([Char](32), HASHBYTES('MD5', UPPER(RTRIM(RMC.CODCODTYP) + '^' + RTRIM(RMC.CODCODNUM) + '^')), 2))
    AND stg.HASH_DIFF = upper(CONVERT([Char](32), HASHBYTES('MD5', UPPER(RTRIM(CONVERT([Char](100), RMC.CODKEYNUM)) + '^' + ) …
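The effect of the NOT EXISTS filter can be tried out on a toy data set. The sketch below uses Python's built-in sqlite3 module rather than SQL Server, with invented table contents and precomputed placeholder hashes (the real view computes them inline with HASHBYTES); it only demonstrates that new keys and changed rows are loaded while unchanged rows and re-runs add nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src (hash_key TEXT, hash_diff TEXT, descr TEXT);
    CREATE TABLE stg (hash_key TEXT, hash_diff TEXT, descr TEXT, load_dts TEXT);
    INSERT INTO stg VALUES ('K1', 'D1', 'unchanged row', '2016-01-01');
    INSERT INTO stg VALUES ('K2', 'D2', 'old version',   '2016-01-01');
    INSERT INTO src VALUES ('K1', 'D1',  'unchanged row');
    INSERT INTO src VALUES ('K2', 'D2b', 'new version');
    INSERT INTO src VALUES ('K3', 'D3',  'brand new key');
""")

# Load only rows whose key/diff combination is not already in the stage table.
LOAD_SQL = """
    INSERT INTO stg (hash_key, hash_diff, descr, load_dts)
    SELECT s.hash_key, s.hash_diff, s.descr, ?
    FROM src s
    WHERE NOT EXISTS (
        SELECT 1
        FROM stg
        WHERE stg.hash_key = s.hash_key
          AND stg.hash_diff = s.hash_diff
    )
"""


def stage_count():
    return conn.execute("SELECT COUNT(*) FROM stg").fetchone()[0]


before = stage_count()
conn.execute(LOAD_SQL, ("2016-01-02",))
after_first = stage_count()
conn.execute(LOAD_SQL, ("2016-01-03",))  # re-running loads no duplicates
after_second = stage_count()
print(before, after_first, after_second)  # 2 4 4
```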
Virtual ODS
Simple database views on stage tables. Tables and columns renamed with business terms
FK added to help BOBJ developer define proper joins
Defining The Virtual ODS Views
Start with Table to View Wizard
› On Stage Tables
Rename view
Use Excel & metadata to create column aliases
› Extract metadata for stage tables (use SDDM Search)
› Add calculated column to Excel
  › ="RMO."&E10350&" AS "&M10350&","
› Cut and paste into View Builder
Add nested table with analytic function
› To only return current rows for ODS
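The Excel formula above simply glues a table alias, the physical column name and the business name together. The same step could be scripted; the Python sketch below is one hypothetical way to do it, with the function name `alias_lines` and the sample pairs chosen for illustration (the column names are borrowed from the view examples in this deck).

```python
def alias_lines(table_alias, columns):
    """Build 'ALIAS.PHYSICAL AS Business_Name,' lines for pasting into a view.

    columns is a list of (physical_name, business_name) pairs, for example
    taken from an export of the stage table metadata.
    """
    return [f"{table_alias}.{physical} AS {business}," for physical, business in columns]


lines = alias_lines(
    "RMC",
    [
        ("CODKEYNUM", "Code_Key_Numeric"),
        ("CODSYSTYP", "System_Value_Type"),
        ("CODLNGDES", "Description"),
    ],
)
print("\n".join(lines))
```

As with the spreadsheet version, the trailing comma on the last line has to be removed by hand before the list is used in a SELECT.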
Analytic Function To Get Current Rows
SELECT
  CONVERT([Char](10), RMC.CODCODNUM) AS Business_Group_Code,
  RMC.CODKEYNUM AS Code_Key_Numeric,
  RMC.CODSYSTYP AS System_Value_Type,
  RMC.CODLNGDES AS Description,
  …
  RMC.LOAD_DTS AS LOAD_DTS,
  CASE
    WHEN RANK() OVER (PARTITION BY RMC.HASH_KEY ORDER BY RMC.LOAD_DTS DESC) = 1
    THEN 'Y'
    ELSE 'N'
  END CURR_FLG
FROM
  DW_STAGE.RMCODP_STG RMC
WHERE
  RMC.CODCODTYP = 'BG'
BUT… Can’t Use Function In Where
01. Have to nest the query with the function as a virtual table in the FROM
02. Then use CURR_FLG in outer WHERE
03. Works in Oracle, SQL Server, and SnowflakeDB
04. Drop the final query into View Builder
    › Save
    › Generate DDL
Example: Virtual ODS View
SELECT
  SRC.Business_Group_Code,
  SRC.Code_Key_Numeric,
  SRC.System_Value_Type,
  …
  SRC.Change_Time,
  SRC.LOAD_DTS
FROM
  (SELECT
     CONVERT([Char](10), RMC.CODCODNUM) AS Business_Group_Code,
     RMC.CODKEYNUM AS Code_Key_Numeric,
     RMC.CODSYSTYP AS System_Value_Type,
     …
     RMC.CODCHGTIM AS Change_Time,
     RMC.LOAD_DTS AS LOAD_DTS,
     CASE
       WHEN RANK() OVER (PARTITION BY RMC.HASH_KEY ORDER BY RMC.LOAD_DTS DESC) = 1
       THEN 'Y'
       ELSE 'N'
     END CURR_FLG -- calculated column
   FROM
     DW_STAGE.RMCODP_STG RMC
   WHERE
     RMC.CODCODTYP = 'BG'
  ) SRC -- nested virtual table
WHERE
  SRC.CURR_FLG = 'Y' -- filter on calculated column
Nested virtual table w/ RANK column and other transforms
Get current rows using virtual column
Main select for view columns
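The nested-query pattern can be tried on a small scale without a SQL Server or Oracle instance. The sketch below runs it through Python's sqlite3 module, assuming the bundled SQLite is version 3.25 or later (needed for window functions); the table contents, the simplified column list and the view name `ods_current` are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rmcodp_stg (hash_key TEXT, descr TEXT, load_dts TEXT);
    INSERT INTO rmcodp_stg VALUES ('K1', 'first version',  '2016-01-01');
    INSERT INTO rmcodp_stg VALUES ('K1', 'second version', '2016-01-02');
    INSERT INTO rmcodp_stg VALUES ('K2', 'only version',   '2016-01-01');

    CREATE VIEW ods_current AS
    SELECT src.hash_key, src.descr, src.load_dts
    FROM (
        -- nested virtual table: the flag is calculated here
        SELECT stg.hash_key, stg.descr, stg.load_dts,
               CASE WHEN RANK() OVER (PARTITION BY stg.hash_key
                                      ORDER BY stg.load_dts DESC) = 1
                    THEN 'Y' ELSE 'N' END AS curr_flg
        FROM rmcodp_stg stg
    ) src
    WHERE src.curr_flg = 'Y';  -- filter on the calculated column
""")

rows = conn.execute(
    "SELECT hash_key, descr FROM ods_current ORDER BY hash_key"
).fetchall()
print(rows)  # [('K1', 'second version'), ('K2', 'only version')]
```

Only the most recently loaded row per HASH_KEY comes back, which is the behavior the ODS view relies on to hide history from operational reports.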
Generate DDL
Use DDL Preview to check
File > Export > DDL
Or click the DDL Icon
Pick the target DB type
Can switch at generate time
Same design can generate Oracle and SQL Server
Conclusion
With planning and good architecture you can be agile
Data Vault provides a good framework
Oracle Data Modeler provides the tool
Think out of the box
› Start with virtual ODS or Data Marts
› Support for both Oracle & SQL Server
› And Snowflake too!
Want More In Depth Training?
SQL Developer Data Modeler Jumpstart: online video training class with demos
Discount code GRAZIANO10S (20% off)
Go to https://kentgraziano.com/sddm1/
SUPER CHARGE YOUR DATA WAREHOUSE
› Available on Amazon.com
› Soft Cover or Kindle Format
› Now also available in PDF at LearnDataVault.com
› Hint: Kent is the Technical Editor
New DV 2.0 Book (includes more details on MD5)
› Available on Amazon: http://www.amazon.com/Building-Scalable-Data-Warehouse-Vault/dp/0128025107/
CONTACT INFORMATION
KENT GRAZIANO
Snowflake Computing
www.snowflake.net
@KentGraziano
http://kentgraziano.com