Page 1

Strategic Data Management – Conforming the Data Warehouse

Session M6, September 24, 2007

Andrea Matulick, Acting Manager, Business Intelligence, Planning and Assurance Services, UniSA

Robert Davies, Technical Team Leader, Enterprise Data Warehouse, ISTS, UniSA

Page 2

Strategic Data Management – Conforming the data warehouse

• Strategy
• Data Management
• Conforming data

Why do we need strategic data management?

• To support the strategic planning cycle
• To maximize performance and funding
• To support integrated processes using technology
• To become an analytic organisation

Page 3

The standard business Strategic Planning cycle:

• Formulate the strategy (using decisions from evidence-based data)
• Communicate the strategy
• Analyse scenarios
• Prepare plans and budgets
• Monitor, forecast, report against actual data
• Feed back the results for the next strategy cycle

Page 4

Maximising performance and funding

Measuring performance - examples of Key Performance Indicators

Indicator | Underlying data
Student Demand | applications, preferences, TER
Research Performance | publications, research income, completions
Student Staff Ratio | student load (EFTSL), staff FTE

Maximising funding - examples of funding formulae

Funding scheme | Formula or inputs
Commonwealth Grant Scheme (CGS) Funding Agreement | total$ = Sum of (CGS student (EFTSL) per cluster x cluster funding rate$)
Research Training Scheme (RTS) | HEP’s specific performance index = (HDR completions x 0.5) + (Research Income x 0.4) + (Research Publications x 0.1)
Learning and Teaching Performance Fund (L&TPF) | Student demand (applications, load); Student experience (CEQ overall satisfaction, generic skills, good teaching) (GDS employment and further study); Student progression (success rate, retention rate, level of study)
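
The two funding formulae above can be sketched in Python. This is a minimal illustration only: the weights 0.5, 0.4 and 0.1 come from the slide, while the function names, cluster names, funding rates and all input figures are hypothetical. The sketch also assumes that each RTS input is expressed as the provider's share of the sector total, which the slide does not state.

```python
# Weights taken from the slide; everything else here is a hypothetical illustration.
RTS_WEIGHTS = {
    "hdr_completions": 0.5,
    "research_income": 0.4,
    "research_publications": 0.1,
}


def rts_performance_index(shares):
    """Weighted sum of the three RTS measures (inputs assumed to be sector shares)."""
    return sum(RTS_WEIGHTS[name] * shares[name] for name in RTS_WEIGHTS)


def cgs_total(load_by_cluster, rate_by_cluster):
    """Sum over clusters of student load (EFTSL) x cluster funding rate ($)."""
    return sum(
        load * rate_by_cluster[cluster]
        for cluster, load in load_by_cluster.items()
    )


# Made-up inputs, not real UniSA or DEST figures.
example_shares = {
    "hdr_completions": 0.02,
    "research_income": 0.03,
    "research_publications": 0.01,
}
example_load = {"cluster_a": 100.0, "cluster_b": 50.0}
example_rates = {"cluster_a": 2000.0, "cluster_b": 8000.0}

print(rts_performance_index(example_shares))
print(cgs_total(example_load, example_rates))
```

Keeping the weights in one named table means a change to the funding formula is a one-line edit, and the same function can be reused for every provider in a benchmarking data set.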

Page 5

Which data is important?

• focus on key issues and critical facts and measures
• do not overload with data from every aspect of the organisation
• data used in key performance indicators and funding formulae
• all data that underlies those indicators, right down to transactional level
• additional external competitor and benchmarking data
• standardised and conformed data

“the right information must be delivered to the right people at the right time and in the right context”

Page 6

Why transactional systems cannot help implement strategy

• transactional systems collect and store data on day to day operational activities
• they do not focus on key issues and critical facts
• data is siloed into source application areas (e.g. students, staff, finance), not combined into logical business processes
• data is not conformed (standardised) so that data may be aggregated across systems
• transactional data is not suitable for corporate analysis, reporting or dashboard applications

Page 7

How can electronic decision support systems help with implementing strategy?

• tie together computer applications and business processes related to key measures
• be able to analyse and report on the results in a timely and accurate manner
• move beyond transactional processing systems, incorporate key managerial processes into technology by integrating applications and using decision support systems

“While 76 percent of executives cite strategic planning as the top management tool to improve long term performance and strengthen integration across an organisation, only 33 percent of executives use electronic decision support tools that could help them in managing performance” (Hackett, business survey 2002)

Page 8

Electronic decision support systems – to help implement strategy

System | Major Function
Corporate Performance Management system (CPM) | Integrate strategic planning documents, organisational processes, targets and responsibilities
Dashboard system | For senior managers to customise key views of performance in their area of responsibility
Benchmarking and Scorecarding system | Record and monitor key performance indicators against targets, indicate success, failure and alerts
Budgeting and Forecasting system | Using actual data to model scenarios and predict future trends
Business Intelligence presentation layer | Enable user access to information via data, analysis and reports
Data Warehouse | Reorganise transactional data into logical business models for corporate analysis and reporting needs. Combine data required for KPIs, add value through external and conformed data, etc.
Metadata system | Provide users context about the data, record data source, lineage, definitions, business rules, etc.
Master Data Management system | Centrally store and maintain the major common data dimensions used by all areas of the organisation (e.g. org structure)
Data Quality system | Conform and standardise common data dimensions across the organisation, identify data errors and anomalies
Transactional systems | Collect source data and process day to day transactions for the organisation

Page 9

How do resources affect strategic data management?

• transactional systems well resourced, supported, upgraded regularly
• other functions relatively unsupported financially, without sufficient experienced resources
• management continually complains about lack of reports and analysis, lagging timeliness of information, consistency and standardisation of results and definitions, lack of forecasting and scenario planning, lack of competitor data, benchmarking, etc.
• need to allocate appropriate resources to decision support systems

Page 10

How does the standard and integrity of data affect corporate decisions?

• “Data warehousing holds much promise to provide competitive advantage through derived business intelligence, but the promise cannot be realised unless you ensure the integrity of your data. You must have end-to-end controls and the ability to identify data anomalies in source data from many operational systems. These controls are an integral part of essential data management best practice.” (Maurer, IBM, DMReview.com, July 2007)
• Data Quality software products assist in identifying data quality issues, but cannot fix data
• Most organisations do not factor data quality resources into any of their plans; it takes a very low priority.
• Organisations do not realise that this omission may be producing poor data on which they are basing their strategic decisions.

Page 11

Why is conforming the data important?

• transactional systems contain very little data quality control
• free text fields mean questionable validity, consistency, standards
• reports and analysis become increasingly difficult; a computer does not recognise data to be the same unless it is identical (e.g. ‘male’ and ‘M’, ‘1’ and ‘01’ are not the same to a computer)
• companies are using Master Data Management systems to maintain common data used by multiple transactional applications
• consistency and standardisation of data and business rules across an organisation are essential for the quality and usefulness of corporate analysis and reports
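
The ‘male’ versus ‘M’ and ‘1’ versus ‘01’ problem can be illustrated with a small Python sketch. The mapping table, the function names and the two-character code width are assumptions made for this example; they are not taken from any UniSA system.

```python
# Hypothetical conformed code set for gender; the raw variants are illustrative only.
GENDER_CONFORMED = {"m": "M", "male": "M", "f": "F", "female": "F"}


def conform_gender(raw):
    """Return the conformed code, or None if the raw value cannot be mapped."""
    if raw is None:
        return None
    return GENDER_CONFORMED.get(raw.strip().lower())


def conform_numeric_code(raw, width=2):
    """Zero-pad a numeric code so that '1' and '01' become the same value."""
    text = raw.strip()
    return text.zfill(width) if text.isdigit() else None


# The raw values differ as strings, but their conformed values match.
print("male" == "M")
print(conform_gender("male") == conform_gender("M"))
print(conform_numeric_code("1") == conform_numeric_code("01"))
```

Returning None for values that cannot be mapped, rather than guessing, keeps unrecognised codes visible so that they can be reported as data quality issues.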

Page 12

How does a data warehouse assist in organising, standardising and conforming data?

• A data warehouse reorganises transactional data into logical business models for corporate analysis and reporting needs
• Combines data from multiple systems and external data in a way that is meaningful to the business
• Uses standard business rules and conformed standard code sets
• Can only report across data from multiple systems if the dimensions are conformed and can be reused across the fact data
• Highlights the need for data quality frameworks and master data management systems to be part of the data management strategy
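
Reporting across systems through a conformed dimension can be sketched as follows. The org units, keys and figures are invented for illustration; the point is only that both sets of fact rows carry the same org unit key, so their measures can be combined.

```python
# Hypothetical conformed Org Unit dimension shared by both fact tables.
org_unit_dim = {1: "School A", 2: "School B"}

# Facts from two different source systems, both keyed on the same org unit key.
publication_facts = [(1, 10), (1, 5), (2, 8)]   # (org_key, publication count)
staff_fte_facts = [(1, 6.0), (2, 4.0)]          # (org_key, staff FTE)


def total_by_key(facts):
    """Aggregate a list of (org_key, measure) rows to one total per key."""
    totals = {}
    for key, measure in facts:
        totals[key] = totals.get(key, 0) + measure
    return totals


def publications_per_fte():
    """Combine both fact tables through the conformed dimension."""
    pubs = total_by_key(publication_facts)
    fte = total_by_key(staff_fte_facts)
    # Org units with no (or zero) FTE are skipped to avoid dividing by zero.
    return {
        name: pubs.get(key, 0) / fte[key]
        for key, name in org_unit_dim.items()
        if fte.get(key)
    }


print(publications_per_fte())
```

If the two systems used different codes for the same org unit, the totals would land under different keys and the ratio could not be computed without first conforming the codes.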

Page 13

Data Warehouse Integration Matrix

Page 14

Data that conforms well - external National Research Performance data

• Research funding (RTS) and the research quality framework (RQF) use publications, income and completions to measure performance.
• DEST provides national data on these measures as a series of reports in a spreadsheet.
• The data can be loaded into a data warehouse along with standard reference data to analyse trends, share, benchmarking, rankings etc.
• By combining the warehouse data in an OLAP cube, we can see how our university is performing in the sector.

Page 15

external National Research Performance data

Page 16

external National Research Performance data

Page 17

Analysis vs Reports

• The original spreadsheet report is static and only shows one measure at a time
• In the warehouse we can add State, ATN and National benchmarking totals, share, rankings
• The OLAP cube provides the ability to analyse the data rather than just look at one report at a time
• However, this data is lagging by at least one year
• We have good data quality, context, but not timeliness
• Need to load our ‘live’ research data into the warehouse daily to help make good strategic decisions
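
The share and ranking measures that the warehouse adds to the national data can be sketched as below. The institution names and income figures are made up, and the tie-breaking rule (alphabetical by name) is an assumption of this example rather than anything stated in the slides.

```python
# Hypothetical research income by institution (not real DEST figures).
national_income = {"Uni A": 40.0, "Uni B": 100.0, "Uni C": 60.0}


def sector_share(values):
    """Each institution's share of the national total."""
    total = sum(values.values())
    return {name: value / total for name, value in values.items()}


def sector_rank(values):
    """Rank institutions from highest (1) to lowest; ties are broken by name."""
    ordered = sorted(values, key=lambda name: (-values[name], name))
    return {name: position for position, name in enumerate(ordered, start=1)}


print(sector_share(national_income))
print(sector_rank(national_income))
```

The same two functions could be applied to a State or ATN subset of the institutions to produce the other benchmarking totals mentioned above.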

Page 18

Data with conforming issues - Live Research Performance data

• need to see performance areas during the current year compared to previous years
• good and poor performing divisions and schools (org units)
• data comes from 3 transactional systems – Research Master, Finance One, and Empower HR
• the data warehouse design was successful in bringing together the data from the 3 systems
• however, the issue of non conforming data proved to be a problem in a number of areas, including org structure which was supposedly controlled from a central ‘master’ file

Page 19

Live Research data – Publications per FTE

Page 20

Org Unit code conforming issues – two transactional systems

Page 21

Some alternative thoughts on data quality…

“Blame everything on the source data and point out that fixing source systems is out of scope.”

“Only use BI tools that let users export the reports to Excel where they can play with the data and produce information that looks much more accurate.”

(McBurney, Senior Consultant, 2006)

Page 22

Overview

Alternative approach to managing conformed data in the warehouse.

Page 23

Agenda/Contents

• Background
• Challenges for a warehouse startup
• Additional challenges
• The two approaches
• Oracle Warehouse Builder
• Other tools

Page 24

University of SA, Data Warehouse

The University data warehouse consists of data from source systems:
• Finance, HR, Student, Master data management system (i.e. Org Unit data)
• Research administration system (data quality challenged)

Covers research and student related business areas, consisting of:
• 10 fact tables
• 2 snapshot fact tables
• 70 dimension tables

Environment: Oracle 9i Rel2, OWB Rel 1, Cognos ver7

Page 25

Challenges in a normal EDW start-up phase:

• Methodology and documentation standards
• Designing and developing ETL technical infrastructure
• Master Data Management System – one source of the truth (i.e. in-house built application to manage Org Unit data)
• Build the warehouse
• Design and build the BI layer

Page 26

Extra challenges to address

• Requirement for low ongoing support
• Some source systems include poor data quality
• Business processes and political environment not focused on data quality improvements

Page 27

Containing the extra challenges through adaptive design

How can we better manage the extra challenges?

Establish Master Data Management system – one source of data

Don’t want Data Quality (DQ) issues to destroy these gains.

Therefore need a robust way to manage DQ issues in the warehouse with minimum impact and intervention.

Page 28

Containing the extra challenges through adaptive design

Managing reference data with DQ issues.

Two approaches considered given our challenges:

1. Kimball recommended approach
2. University of SA, Hybrid Approach

Oracle Warehouse Builder Release 1 does not provide an automatic means to manage SCD (slowly changing dimension) tables.

Page 29

Kimball recommended approach

For an incoming fact row that has an unmatched dimensional value:

• automatically create a new dimension entry placeholder as a result.
• assume that at a later date the dimension row which matches the placeholder will arrive and overwrite the placeholder with a full row of attributes.
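
A minimal Python sketch of the placeholder technique described above follows. It is an illustration under stated assumptions, not Kimball's or any ETL tool's implementation: the in-memory dictionary, the function names, the starting key value and the "Unknown" description are all hypothetical.

```python
# Hypothetical in-memory dimension keyed by business code.
org_dim = {"ITU": {"key": 1234, "description": "Info Tech", "placeholder": False}}
next_key = 1235


def lookup_or_create_placeholder(business_code):
    """Return the surrogate key, creating a placeholder row for an unmatched code."""
    global next_key
    row = org_dim.get(business_code)
    if row is None:
        row = {"key": next_key, "description": "Unknown", "placeholder": True}
        org_dim[business_code] = row
        next_key += 1
    return row["key"]


def load_dimension_row(business_code, description):
    """A late-arriving dimension row overwrites the placeholder, keeping its key."""
    key = lookup_or_create_placeholder(business_code)
    org_dim[business_code] = {
        "key": key,
        "description": description,
        "placeholder": False,
    }


fact_key = lookup_or_create_placeholder("GPB")   # unmatched, so a placeholder is created
load_dimension_row("GPB", "Grounds")             # the real attributes arrive later
print(fact_key, org_dim["GPB"])
```

Because the surrogate key is assigned when the fact row arrives, the fact table never needs to be revisited; the cost is that the fact load, and not only the dimension's own source, is now creating dimension rows.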

Page 30

Kimball recommended approach

Advantages:
• No Factual data is lost (?)
• Proven approach which works efficiently for large Fact tables
• Some ETL tools do this work for you.

Page 31

Kimball recommended approach

Disadvantages:
• more than one source of data for the dimension, so potentially more than one source of the truth.
• if the dimension is conformed then rubbish data is made available to all areas of the data warehouse, unless it is managed.
• if effective dating is involved, it has the potential to corrupt contiguous date ranges.
• if only part of a placeholder is available from the Fact row (i.e. the code and no Efft Date), then either the Fact record gets written to a log file and the dim key is set to unknown, or the Fact row is rejected completely.
  − Either case probably requires manual intervention to resolve.

Page 32

UniSA hybrid approach

Capture Unknowns

For an incoming fact row that has an unmatched dimensional value:

1. Store the unmatched business code in the core Fact table.
2. The dimension surrogate key within the fact record is set to -1.

Hide business code from user reporting layer

Page 33

UniSA hybrid approach – Unmatched Business Code

Fact Surrogate_key | | Org Unit Bus Code | Org Code Key | Org Key Version | Fact Measure |
1001 | | GPB | -1 | 1 | 0.5 |
1001 | | ITU | 1234 | 1 | 1.0 |
1001 | | ITU | 1234 | 2 | 0.9 |
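
The capture step of the hybrid approach can be sketched as follows. The column names loosely follow the table above, but the dictionary-based lookup and the function name are assumptions for this example, and the key version column is left out to keep the sketch short.

```python
UNKNOWN_KEY = -1

# Hypothetical dimension lookup: business code -> surrogate key.
org_dim_keys = {"ITU": 1234}


def build_fact_row(org_business_code, measure):
    """Keep the raw business code on the fact row; unmatched codes get key -1."""
    return {
        "org_unit_bus_code": org_business_code,
        "org_code_key": org_dim_keys.get(org_business_code, UNKNOWN_KEY),
        "fact_measure": measure,
    }


fact_table = [build_fact_row("GPB", 0.5), build_fact_row("ITU", 1.0)]

# Rows still pointing at the Unknown member, e.g. for a data quality report.
unknowns = [row for row in fact_table if row["org_code_key"] == UNKNOWN_KEY]
print(unknowns)
```

Unlike the placeholder technique, the fact load never writes to the dimension here; the unmatched code is simply carried on the fact row until the dimension's own source supplies it.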

Page 34

UniSA hybrid approach

Reprocessing the Unknowns.

At a later date:
• Copy the core fact rows into the staging table where the business code exists and the corresponding surrogate key = -1
• Reconcile against the dimension table in order to obtain a known key. Reuse existing transformation mappings
• Merge the Fact record back into the core Fact table

Page 35

UniSA hybrid approach – Reprocessed Business Code

| Org Code Key | Org Key Version | Org Code | Org Description | Current_Flag
| -1 | 1 | | Unknown | Y
| 1234 | 1 | ITU | Info Tech | N
| 1234 | 2 | ITU | Information Tech | Y
| 1235 | 1 | GPB | Grounds | Y
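
The reprocessing of unknowns, once the dimension contains the missing code (GPB with key 1235 in the table above), might look like the sketch below. It is a simplified illustration: the staging table is a Python list, key versions and effective dates are ignored, and all names are hypothetical rather than the actual OWB mappings.

```python
UNKNOWN_KEY = -1

# Hypothetical state: the dimension now holds GPB, but the fact row still has key -1.
org_dim_keys = {"ITU": 1234, "GPB": 1235}
core_fact = [
    {"org_unit_bus_code": "GPB", "org_code_key": UNKNOWN_KEY, "fact_measure": 0.5},
    {"org_unit_bus_code": "ITU", "org_code_key": 1234, "fact_measure": 1.0},
]


def reprocess_unknowns(fact_rows, dim_keys):
    """Re-resolve rows whose key is -1; return how many rows were fixed."""
    # Step 1: stage the rows that have a business code but an unknown key.
    staging = [
        row for row in fact_rows
        if row["org_code_key"] == UNKNOWN_KEY and row["org_unit_bus_code"]
    ]
    fixed = 0
    for row in staging:
        # Step 2: reconcile against the dimension to obtain a known key.
        key = dim_keys.get(row["org_unit_bus_code"], UNKNOWN_KEY)
        if key != UNKNOWN_KEY:
            # Step 3: merge back. Staging holds references to the core rows,
            # so updating the staged row updates the core fact row.
            row["org_code_key"] = key
            fixed += 1
    return fixed


print(reprocess_unknowns(core_fact, org_dim_keys), core_fact[0])
```

The job is safe to rerun: rows whose codes are still missing from the dimension stay at -1 and are picked up again on the next pass.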

Page 36

UniSA hybrid approach

Advantages:
• No Factual data is lost.
• The one data source controls the truth for each dimension table.
• Automatic poor data quality quarantine
• Data quality issues peculiar to the given source system are not propagated throughout the entire warehouse.
• No ongoing maintenance overhead with potential accumulation of rubbish data within dimensions.
• No need for an ever-expanding number of fix up scripts.
• DQ issues can be handled on a subject area basis, assisting in prioritization.
• Unknowns report per Fact subject area available for DQ department.

Page 37

UniSA hybrid approach

Disadvantages:
• Fact table requires extra processing on a regular basis in order to reconcile the unknown dim keys.
• Requires that the raw business codes are present in the Fact (not necessarily visible for user reporting)
• Possibly not suitable for very large Fact tables (> 10 million fact table records) where DQ is an ongoing issue, but OK for Uni data volumes.

Page 38

Oracle Warehouse Builder (OWB)

• Low cost
• Mappings automatically perform bulk inserts
• Excellent ETL auditing information available
• Process Flow allows forking of multiple database sessions

Page 39

Use Emphasis on Graphics

Page 40

OWB Process Flow – control

Page 41

OWB Process Flow – session forking

Page 42

More alternative thoughts on data quality…

“Default null values to the word ‘unknown’. If anyone questions this point out that unknown is used liberally throughout all the source systems and is more useful than not knowing that it is unknown.”

“You will soon find that your information management projects are being delivered on time and are no less accurate than the source systems”

(McBurney, Senior Consultant, 2006)

