+ All Categories
Home > Documents > Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter...

Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter...

Date post: 25-Dec-2015
Category:
Upload: elizabeth-little
View: 220 times
Download: 0 times
Share this document with a friend
Popular Tags:
49
Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson [email protected] www.svaltech.com SvalTech “Data Quality: The Accuracy Dimension”, Morgan Kaufmann, 2003 “Database Archiving: How to Keep Lots of Data for a Long Time", Morgan Kaufmann, 2008 Copyright SvalTech, Inc., 2010
Transcript
Page 1: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

Two Neglected Data Management Technologies

Data Profiling & Database Archiving

DAMA Phoenix Chapter9 September, 2010

Jack E. Olson

[email protected]

www.svaltech.com

SvalTech

“Data Quality: The Accuracy Dimension”, Morgan Kaufmann, 2003“Database Archiving: How to Keep Lots of Data for a Long Time", Morgan Kaufmann, 2008

Copyright SvalTech, Inc., 2010

Page 2: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

2

Why Two Topics

Copyright SvalTech, Inc., 2010

SvalTech

• Jack knows a lot about each of them– Wrote a book on data profiling– Wrote a book on database archiving– Built data profiling product: AXIO at Evoke– Built database archiving product: TITAN at NESI

• Both are post-operational data management technologies

• Neither is required for operational purposes

• Both have huge potential for return on investment– Cost savings– Operational performance improvements– Risk reduction

• Both are liked by compliance/governance folks

• Both are under-utilized in IT shops

Page 3: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

3

What I’m Going to Talk About

Copyright SvalTech, Inc., 2010

SvalTech

For each Technology,

• Define Technology• Identify Where It is Used• Show How it Works• Basic Solution Architectures• Show Business Case Basics• Why it is Neglected

Page 4: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

4

Data Profiling

SvalTech

Copyright SvalTech, Inc., 2010

Page 5: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

5

Books

Copyright SvalTech, Inc., 2010

SvalTech

• Data Quality: The Accuracy Dimension, Jack E. Olson, Morgan Kaufmann, 2003

• Executing Data Quality Projects” Ten Steps to Data Quality and Trusted Information (tm), Danette McGilray, Morgan Kaufmann, 2008

• Three Dimensional Analysis – Data Profiling Techniques, Ed Lindsey, Data Profiling LLC, 2008

• Data Quality Assessment, Maydanchik Arkady, Technics Publications, LLC, 2007

Page 6: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

6

Definition

Copyright SvalTech, Inc., 2010

SvalTech

Data Profiling is the application of data analysis techniquesto existing data for the purpose of determining the actual content, structure, and quality of the data.

Data

Profiling

metadata

accurate &

inaccurate

Data

accurate &

inaccurate

accurate

metadata

facts about

inaccurate

data

data quality

issues

Page 7: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

7

Where is Data Profiling Used?

Copyright SvalTech, Inc., 2010

SvalTech

• Enterprise Data Quality Improvement Program– Traditional Six Sigma Like Program– Recursive application of data quality assessment– Based on historical success of companies who have used it

• Support Consolidation of Databases after mergers and acquisitions– Dramatically reduce cost and time to complete projects– Improve quality of data in resulting system

• Support application renovation projects– Dramatically reduce time and cost to complete– Improve quality of data in resulting system

• Support data integration functions for data warehousing/ business intelligence data stores

– Develop processes to cleanse data in transit– Improve quality of data in information intelligence stores

Page 8: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

8

Data Quality Program

Copyright SvalTech, Inc., 2010

SvalTech

Data Profiling

Data Quality

Discovery

Data Quality

Correction

Data Quality

Prevention

Incident

Investigation

Name & Address

Cleansing

De-duplication

Re-verification

Data Quality

Monitoring

Business Process

Improvements

Application Software

Improvements

Data Quality Issue

Formulation &

Tracking

Page 9: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

9

Data Quality Discovery

Copyright SvalTech, Inc., 2010

SvalTech

Data Profiling

Incident

Investigation

DATA profile DQ issues

Suspicion

Of DQ problem

profiler review

process

review

DATA DQ issues

review

Page 10: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

10

Data Profiling Functions

Copyright SvalTech, Inc., 2010

SvalTech

• Column Examination– List all values in column with frequency of occurrence– Show high and low values– Determine true data type– Determine degree of uniqueness– Determine encoding patterns used, frequency of each pattern– Compute values: AVG, SUM, MEDIAN, STD DEVIATION

• Row Examination– Find all primary key candidates (single or multi-column)– Find intra-row column dependencies (find denormalization instances)– Find multi-column value relationships

• Value ordering rules• NULL value dependencies

• Multi-table Examination– Find matching columns across tables

• Match by column name, data type• Match by values

– Find primary/foreign key pairs (single and multi-column)– Determine 1-1, 1-M, 1-0, M-1, M-M, 0-1 rules– Find primary values not found in secondary tables

• Test User provided data rules

Page 11: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

11

Potential for Discovery of Value Problems

Copyright SvalTech, Inc., 2010

SvalTech

missing & invalid

values

valid but wrong

values

correct values

errors that can be

found through

analytical techniques

errors that can be

fixed without

re-verification

values recorded

in a column of a

table

Page 12: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

12

What Types of Problems can be identified for values in a column?

Copyright SvalTech, Inc., 2010

SvalTech

Invalid Values - missing values when should not be missing

- values out of range or not in domain of expected values

- value in one column not possible when combined with

values in one or more other columns

- obviously wrong when looked at•Name = Donald Duck•Address = 1600 Pennsylvania Avenue

Valid Values - distribution of values unexpected

too many of one or more values

too few of one or more values

- value is incompatible with other values in other columns

- multiple values mean same thing

Page 13: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

13

What Types of Problems can be Fixed?

Copyright SvalTech, Inc., 2010

SvalTech

• Synonyms (multiple representations for same value)

• Multiple ways to express same value

•Date formats

•Number formats

•Use of case on character values

• Cases where the value of a column can be determined from the

values in one or more other columns

explicitly: through a rule

correlation against known combinations

Page 14: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

14

Role of Metadata and Data Rules

Copyright SvalTech, Inc., 2010

SvalTech

• must define what constitutes correct

• traditional metadata is only part of definition

• traditional metadata is often inaccurate and/or incomplete

• data rules exist whether known or not

• data rules exist whether enforced or not

• profiling can validate metadata and known data rules

• can be used to correct or enhance metadata and data rules

• can be used to discover additional data rules

• can test adherence to data rules

Page 15: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

15

Data Profiling Solution Architectures

Copyright SvalTech, Inc., 2010

SvalTech

metadata

data rules

data

extract

samples

gather metadata

& data rules

profilerprofile

reviewupdate metadata

and data rules

Identify and quantify problems

determine impact on business

create DQ problem tickets

Page 16: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

16

Solution Architectures

Copyright SvalTech, Inc., 2010

SvalTech

Home Grown Vendor Tools

Inefficient

Directed SQL; not generic

Easy Algorithms

Few Surprises uncovered

More sophisticated functions

third normal form discovery

foreign key discovery

multiple pattern recognition

More management of results

Page 17: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

17

State of Vendor Support

Copyright SvalTech, Inc., 2010

SvalTech

• Fragmented– Different products call themselves data profiling but do different things– Some non-profiling products call themselves profiling

• Buried in multiple data management functions

• Marketed as compliance product

• Not a lot of advances in technology over last few years– More complex array of profiling functions– Targeted profiling functions– Menu of functions to pick from– SaaS– Profiling of metadata– Monitoring operational systems for profiling expectations

Page 18: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

18

Business Case Basics

Copyright SvalTech, Inc., 2010

SvalTech

• Dollar savings– Fewer Operational Mistakes

• Reduced rework• Reduced customer returns

– Improved Analytic Results• Better decision making from Analytics

• Improved Operations– Fewer operational glitches due to wrong data values– Reduced time to complete projects

• Risk Reduction– Reduced exposure to catastrophic events– Reduced exposure to legal scrutiny of data

It’s all indirect:

you cannot determine

what value you will get

until after you have

profiled data, studied

results and used it to

some advantage.

Data Quality problems can cost

a company 15-25% of bottom line profit.

Page 19: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

19

How to Get Started

Copyright SvalTech, Inc., 2010

SvalTech

• Need to establish as part of quality improvement/ compliance/ governance program

– 6 Sigma like– Multi-department interest

• Start with a pilot data quality assessment of a critical data resource– Small– Targeted: high value consequences of poor data– “see what we find”

• Possibly start with a renovation/ consolidation/ integration project– Process to establish source level metadata and understand data content

• Need to Quantify Results

• Use Success to Expand Use of Technology

Page 20: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

20

Why is it Neglected?

Copyright SvalTech, Inc., 2010

SvalTech

• Problems are not on Top Ten List of IT Concerns

• Cannot predict value: cannot build business case in advance– Don’t know value of a data quality improvements until after you discover the problems– Value is hard to quantify: does not appear until quality problems acted upon

• Perception that you don’t need it

• Exposes corporate/ IT weaknesses

• No Clear Owner of Data Quality

• Lack of education on value, cost to implement– Management education– Technical Staff education

Page 21: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

21

Database Archiving

SvalTech

Copyright SvalTech, Inc., 2010

Page 22: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

22

Books

Copyright SvalTech, Inc., 2010

SvalTech

• Database Archiving: How to Keep Lots of Data for a Very Long Time, Jack E. Olson, Morgan Kaufmann, 2009

Page 23: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

23

Definition

Document Archiving word pdf excel XML

File Archiving structured files source code reports

Email Archiving outlook lotus notes

Database Archiving DB2 IMS ORACLE SAP PEOPLESOFT

Physical Documents application forms mortgage papers prescriptions

Multi-media files pictures sound telemetry

The process of removing selected data items from operational databases that are not expected to be referencedagain and storing them in an archive database where they can be retrieved if needed.

SvalTech

Copyright SvalTech, Inc., 2010

Page 24: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

24

Business Records: the Archive Unit

SvalTech

You don’t archive databases; you archive data from databases.

A Business Record is the data captured and maintained for a single business event or to describe a single real world object.

Databases are collections of Business Records.

Database Archiving is Records Retention.

customer employee stock trade purchase order deposit loan payment

Copyright SvalTech, Inc., 2010

Page 25: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

25

Data Retention

SvalTech

The requirement to keep data for a business object for a specified period of time. The object cannot be destroyed untilafter the time for all such requirements applicable to it has past.

Business Requirements

Regulatory Requirements

The Data Retention requirement is the longest of all requirement lines.

Copyright SvalTech, Inc., 2010

Page 26: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

26

Data Retention

SvalTech

• Retention requirements vary by business object type

• Retention requirements from regulations are exceeding business requirements

• Retention requirements will vary by country

• Retention requirements imply the obligation to maintain the authenticity of the data throughout the retention period

• Retention requirements imply the requirement to faithfully render the data on demand in a common business form understandable to the requestor

• The most important business objects tend to have the longest retention periods

• The data with the longest retention periods tends to be accumulate the largest number of instances

• Retention requirements often exceed 10 years. Requirements exist for 25, 50, 70 and more years for some applications

Copyright SvalTech, Inc., 2010

Page 27: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

27

Where Database Archiving Is Used

SvalTech

Copyright SvalTech, Inc., 2010

• Overloaded Operational Databases

• Retired Applications

• Application Renovation Projects

Page 28: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

28

Overloaded Operational DatabasesSvalTech

• Transaction data• Lots of data

– Hundreds of millions of rows– High daily transaction rate

• 24/7 operational availability requirement• Long retention period (15 years or more) • Short useful active life (less than 2 years)• Low access requirements during the inactive period

– Very low access frequency– Response time not critical– Access requirements are simple, easily satisfied with ad hoc tools

Copyright SvalTech, Inc., 2010

Page 29: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

29

Retired ApplicationsSvalTech

• Merger of companies results in an operational application being duplicated

• Data Structures are not compatible– One keeps data elements not in other– One encodes data elements differently– One designed for different OS/DBMS than other

• Decision is made to use one system and abandon the other one

• Meets all characteristics of an operational application

Copyright SvalTech, Inc., 2010

Page 30: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

30

Application Renovation ProjectsSvalTech

• Application is undergoing major change– Replaced with packaged application– Legacy modernization– Legacy termination– Rewritten to be web-centric– Need to satisfy new requirements

• Old data structures are out of date– Legacy DBMS– Legacy file system

• Data meets all other requirements for archiving operational application

Copyright SvalTech, Inc., 2010

Page 31: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

31

Architecture of Database Archiving

Archive Server

Operational System

archive catalog

archive storage

OP DB

Archive AdministratorArchive DesignerArchive Data ManagerArchive Access Manager

SvalTech

Archive Extractor

Application program

Archive extractor

Copyright SvalTech, Inc., 2010

Page 32: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

32

Archive StaffSvalTech

• Database Archive Specialist– Received education on database archive design and implementation– Knows tools available– Experienced– Full time job

• Database Archive Administrator– Received education on database archiving administration– Full time job

• Supporting Roles– Storage Administrators– Database Administrators– Data Stewards– Security Administrators– Compliance staff– IT management– Business Unit Management– Legal– Records Management

Copyright SvalTech, Inc., 2010

Page 33: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

33

Archive Designer ComponentSvalTech

• Metadata– Capture current metadata– Validate it– Enhance it– Design archive storage format

• Data– Define business records to be archived– Define source of data– Define data structures within operational system– Define reference data needed to include with it– Define archive format of data

• Policies– Define extract policy (when a record becomes inactive)– Define operational disposal policy (when to remove from operational database)– Define storage policy (how to protect data in archive)– Define discard policy (when to remove from archive)

Copyright SvalTech, Inc., 2010

Page 34: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

34

Archive Extractor ComponentsSvalTech

• Extractor process– Verify consistency with design metadata– Extract data as defined in designer– Mark or delete from operational database as defined in designer– Pass data to archive data manager– Keep audit records on everything done– Do not impact operational performance– Support interruptions with transaction level recovery– Support restart– Finish scans within acceptable time periods

• Scheduling– Establish periodic executions– Find non-disruptive periods– Be consistent

Copyright SvalTech, Inc., 2010

Page 35: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

35

Archive ExtractorsPhysical vs. Application Extractors

SvalTech

Copyright SvalTech, Inc., 2010

Operational System

OP DB

Archive Extractor

Application program

Archive extractor

Physical ExtractorGets/deletes data directly from the database

tables, rows, columns

Application ExtractorGets/deletes data from an application API

virtual tables, rows, columns

application program

Page 36: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

36

Archive Data Manager ComponentSvalTech

• Put data away– Receive data from extractors– Format into archive segment files– Determine metadata version affinity– Format and store metadata files if new– Build or update segment indexes both internal and external

• Execute Storage policies– Encryption/ signatures– Backup copies created and stored– Geographic dispersion of backups– Register archive files with archive catalog– Enter audit trail information

• Fetch metadata on request– Return to accessing programs

• Fetch data on request– Scan archive segments– Search through indexes

• Execute Archive Discard Process– Periodic scheduling– Delete qualifying business records– Update archive catalog

Copyright SvalTech, Inc., 2010

Page 37: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

37

Archive Access ComponentSvalTech

• Query Capability– Determine applicability based on archive segment versions of metadata– SQL based is best, if possible– Employ external indexes to determine which archive segments to look into– Employ internal indexes to avoid reading all of an archive segment

• Support standard access tools– Report generation (such as Crystal Reports)– Generic query tools– JDBC interface

• Support metadata version browsing

• Support generation of load files based on query results

• Support generation of load files based on original data source based on query results

Copyright SvalTech, Inc., 2010

Page 38: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

38

Archive Administration ComponentSvalTech

• Manage Archive Catalog– Application archive designs– Audit trails– Results logs

• Manage Archive Storage Systems– Ensure periodic readability checks– Maintain access audit trails

• Manage Archive Access– Authorizations for users– Authorizations for specific events

• Unloads– Ensure audit records are created for all access

• Manage e-Discovery requests

• Ensure Extract and Discard processes are run when they are supposed to

• Manage Metadata Change Process

Copyright SvalTech, Inc., 2010

Page 39: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

39

SvalTech

Home-Grown Solutions:

Use Parallel DB Use Database Partitions Put in UNLOAD files Save Image Copies of DB

Copyright SvalTech, Inc., 2010

Solution ComparisonsHome-Grown vs. Vendor

Vendor Solutions:

More Complete Solutions Support Long Term Administration Put data in XML files Put data in reformatted files Exploit strengths of storage subsystems Direct access support

Page 40: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

40

SvalTech

Copyright SvalTech, Inc., 2010

Home-Grown Solutions• Solve Operational Problems, BUT:

– Create downstream problems

– Fail to achieve cost savings

– Render archive data inaccessible

• Either completely or,

• Expensive in time and cost to query

– Lose data authenticity

• Common Omissions– No handling or improvement of metadata

– No change process for structure changes

– No long term storage management

– Fail to achieve application/system independence

– No administration platform

Page 41: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

41

SvalTech

Copyright SvalTech, Inc., 2010

Vendor Solutions• Not a Lot of Vendors

– Only 7 I know of– 3 large companies

• Through acquisition– Gartner pre-recession characterization

• Is a new technology• $100M in 2008• 40% per year growth rate• Early adopter stage

• Solutions not complete– Need growth in function and maturity– Common weak spots

• Design modeling• Extractor technology• Not pervasive across data sources• Storage structure• Storage management

Page 42: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

42

Business Case BasicsDrivers

SvalTech

overloadedoperationaldatabases

Longer Data Retention requirements

Expanded Business

Mergers and Acquisitions

Copyright SvalTech, Inc., 2010

Operational problems

Data Governancee-Records Retentione-Discovery Readiness concerns

Difficulty in Making Application Changes

Cost of Keeping Old Systems

Page 43: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

43

Reason for Archiving

Operational operational archive

All data in operational db

most expensive system most expensive storage most expensive software

Inactive data in archive db

least expensive system least expensive storage least expensive software

In a typical op db60-80% of datais inactive

This percentageis growing

SvalTech

Size Today

Copyright SvalTech, Inc., 2010

Page 44: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

44

Cost Saving ElementsSvalTech

Copyright SvalTech, Inc., 2010

Look for and compute difference in storage costs

front-line vs archive storage

byte counts differences between operational and archive

Look for and compute difference in system costs

operational vs archive systems

are operational system upgrades avoided

are software upgrades avoided

can systems be eliminated for application

can software be eliminated for application

Look for savings on people costs

can people be eliminated or redirected for retired applications

Potential savings on changes/ application renovations

simplification of design

elimination of data conversions

Page 45: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

45

Operational Efficiency ImpactsSvalTech

Copyright SvalTech, Inc., 2010

Will operational performance be enhanced with less data

Will utility time periods be reduced (backup, reorganization)

fewer occurrences needed

less data to process each time

Will recovery times be reduced and what is that worth

interruption recoveries

disaster recoveries

Will implementation of data structure changes be improved

avoided

reduced amount of data to unload/modify/reload

Page 46: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

46

Risk FactorsSvalTech

Copyright SvalTech, Inc., 2010

Will the saved data have better authenticity

not changed in archive

shielded from updates or damage

traceable back to original form

Will e-Discovery benefit from archiving

can locate and process data outside of operational environment

can easily create legal-hold archive units

Will exposure of data reduced

fewer authorized users against the archive

complete audit trails of all access

Page 47: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

47

Business Case SummarySvalTech

Copyright SvalTech, Inc., 2010

• Database Archiving solutions generally provide for lower cost software,

can use lower cost storage more efficiently, and run on smaller machines.

• Each business case is different

Many factors can be used in building business case

Seen an application justified on storage costs alone

Seen an application justified on disaster recovery time alone

Seen an application justified on better data security alone

• Each organization will have many potential applications

• Having a database archiving practice can create synergies across many

applications thus adding more value

Page 48: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

48

Why is it Neglected?

Copyright SvalTech, Inc., 2010

SvalTech

• Belief that work arounds are good enough

• Technology is too new

• Technology is incomplete or flawed

• Inability to produce complete business case to justify

• Lack of experience in using technology

• Lack of education on value, cost to implement– Management education– Technical Staff education

Page 49: Two Neglected Data Management Technologies Data Profiling & Database Archiving DAMA Phoenix Chapter 9 September, 2010 Jack E. Olson jack.olson@SvalTech.com.

49

Final ThoughtsSvalTech

Copyright SvalTech, Inc., 2010

Are these two technologies related?

Enterprises who consider data a critical and valuable asset and who have a goal

of collecting, protecting, preserving and deploying data to the best advantage

of the enterprise WILL have robust data management and data governance practices.

These will include professional management of:

- data and metadata architectural control

- active and ongoing data quality improvement program

- active and ongoing data archiving program (records management)

- data security control

- access authorization and auditing

- data encryption

- data protection program

- backups

- disaster recovery

- protection from unauthorized access/ changes

- data distribution control


Recommended