Data Quality

Date post: 29-Oct-2014
Upload: michael-collins
Description:
A view of the importance of data quality and how to set about addressing this issue in your business
Page 1: Data Quality

© Michael Collins 2001-2010. All rights reserved. www.dmcounsel.co.uk

“Making your data a strategic asset”

Data Quality - The Key to Successful Analytics and CRM

Michael Collins BA(Hons), DipM, MCIM, FIDM

Managing Consultant - Database Marketing Counsel
Visiting University Lecturer in Database Marketing & CRM

DATABASE MARKETING COUNSEL

Page 2: Data Quality

Our Agenda

• The importance of understanding your core systems
• How clean should your data be?
• The impact of a data audit
• Where does poor data come from?
• Methodology for improvement
• Getting employees to ‘live and breathe’ data quality

Page 3: Data Quality

Data as a corporate asset

• Value to the business
• Invest in the maintenance of an asset
• Responsibility of all who use it, access it, are involved in its acquisition, storage or maintenance
• Rules of management
• Validation
• Security
• An appreciating asset – in everyone’s interest

Page 4: Data Quality

Four kinds of quality issues

• Common data-entry errors
• Out of date – past its “use-by” date
• Lack of consistency
• Unreliable sources

Poor quality and integrity of data limits its value

Page 5: Data Quality

Implications of Inaccurate Data

• There is no substitute for acquiring accurate data - analysis tools can’t compensate for lack of data
• The more “real time” contact we have with customers, suppliers or employees, the more accurate the data needs to be and the more devastating can be the results of inaccuracy
• Quality will determine whether analysis can only act as a guide or can reach ‘black & white’ conclusions

Page 6: Data Quality

Implications of Inaccurate Data

• Skewed campaign planning
• Improper selections for campaigns
• Expensive product mistakes
• Non-delivery of the message (esp. E-mail)
• Reflection of your business
• ‘Junk mail/spam’ tag
• £££££ Wasted

Page 7: Data Quality

Implications of Inaccurate Data

Inaccuracy will annoy customers, suppliers and staff

Page 8: Data Quality

Data Quality, Quality Data

1 Profile: Understanding
2 Audit: Qualification
3 Integrate: Consolidation
4 Enrich: Improvement
5 Monitor: Observation
6 Culture: Compliance

Page 9: Data Quality

Typical Framework

[Diagram labels: Sources (Source A, Source B, Source C); Extract/Transform/Load Processes; Rules; External Data; Operational CRM; Campaign Management; BI & Visualisation. Highlighted: DATA QUALITY]

Page 10: Data Quality

Typical Framework

[Diagram labels: Sources (Source A, Source B, Source C); Extract/Transform/Load Processes; Rules; External Data; Operational CRM; Campaign Management; BI & Visualisation. Highlighted: QUALITY DATA]

Page 11: Data Quality

Profiling Your Sources

• Current business processes
• Tactical activity
• Enhancement from external sources
• Business information vendors
• Purchased lists
• Marketing partners

Page 12: Data Quality

Sources

[Diagram labels: DATABASE; WARRANTY; SURVEYS - Behavioural; ENQUIRIES/HELP LINE; SALES; COMPLAINTS; BRANCHES/CHANNELS; ACCOUNTS; OTHER TOUCH POINTS (SMS, Social Networking); EXTERNAL]

Page 13: Data Quality

Scoring the Sources

• Score the data as part of your data strategy
• Build a model that provides a level of confidence
• Base the model on known factors
— Source
— Recency of update
— Testing
• Use the score to determine priorities for enhancement and to inform the business of the level of confidence
• Strive to improve the level of confidence
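
One way to picture such a scoring model is the minimal Python sketch below, which combines the three factors named on this slide (source, recency of update, testing) into a 0-100 confidence score. The source weights, the two-year decay window, the 50/30/20 weighting and the function name are all illustrative assumptions, not figures from the presentation.

```python
from datetime import date

# Hypothetical trust weights per source (0.0-1.0); tune these to your own sources.
SOURCE_WEIGHT = {"sales": 0.9, "warranty": 0.8, "purchased_list": 0.4}


def confidence_score(source, last_updated, tested, today):
    """Return a 0-100 confidence score from source, recency of update and testing."""
    source_part = SOURCE_WEIGHT.get(source, 0.2)  # unknown sources get a low weight
    age_days = max((today - last_updated).days, 0)
    recency_part = max(0.0, 1.0 - age_days / 730)  # assumed linear decay over two years
    tested_part = 1.0 if tested else 0.0
    # Assumed weighting: 50% source, 30% recency, 20% testing.
    return round(100 * (0.5 * source_part + 0.3 * recency_part + 0.2 * tested_part))


today = date(2010, 6, 30)
fresh = confidence_score("sales", date(2010, 6, 30), True, today)
stale = confidence_score("purchased_list", date(2007, 1, 1), False, today)
print(fresh, stale)
```

Scores like these can then be grouped into confidence bands to decide which records to enhance first.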

Page 14: Data Quality

Review Your Scores

Business Services Company: number of records in each % Confidence band

% Confidence       0-20    21-50   50-60  60-70  70-80  80-90  90-95   95-100  Total records
No. of employees   12,369  134     98     167    698    469    39,870  679     54,484
Industry Sector    457     32      8      0      0      0      0       53,987  54,484
Turnover           457     32      8      0      0      0      0       53,987  54,484
Growth             12,670  34,678  0      0      30     35     524     6,547   54,484

Page 15: Data Quality

Compare Your Scores

Business Services Company – No. of Employees

% Confidence    0-20    20-50  50-60  60-70  70-80  80-90  90-95  95-100
Audit records   12369   134    98     167    698    469    39870  679
Audit %         23      0      0      0      1      1      73     1
Universe        135698  179    129    11356  2987   1500   673    1598123
Universe %      8       0      0      1      0      0      0      91
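
The percentage rows in this comparison are each band's share of its row total, rounded to a whole percent. The short Python sketch below recomputes them from the record counts shown on the slide; the function and variable names are illustrative only.

```python
BANDS = ["0-20", "20-50", "50-60", "60-70", "70-80", "80-90", "90-95", "95-100"]
AUDIT = [12369, 134, 98, 167, 698, 469, 39870, 679]
UNIVERSE = [135698, 179, 129, 11356, 2987, 1500, 673, 1598123]


def band_percentages(counts):
    """Return each band's share of the row total as a whole-number percentage."""
    total = sum(counts)
    return [round(100 * c / total) for c in counts]


audit_pct = band_percentages(AUDIT)
universe_pct = band_percentages(UNIVERSE)
for band, a, u in zip(BANDS, audit_pct, universe_pct):
    print(f"{band:>6}  audit {a:>3}%  universe {u:>3}%")
```

Seen side by side, 23% of the audited records fall in the 0-20 band against 8% of the universe, while 73% of audited records sit in the 90-95 band and 91% of the universe sits in 95-100.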

Page 16: Data Quality

Typical Framework

[Diagram labels: Sources (Source A, Source B, Source C); Extract/Transform/Load Processes; Rules; External Data; Operational CRM; Campaign Management; BI & Visualisation. Highlighted: HIERARCHY]

Page 17: Data Quality

Data Quality Process

• Data Audit - technology
• What needs fixing
• What needs summarizing
• What derived data is required
• Attrition - data use by date!
• How do you fix and improve it
— external enhancement, internal technology
• Data business rules
• What to do while you are fixing it!
• Keeping it fixed – monitor and enhance!

Page 18: Data Quality

Data Audit

• Appraise the data
— Technology for auditing the data
— Do fields hold what they claim to hold?
— Is it in a usable format?
  • For operations?
  • For analytics?
— How extensively populated are the fields?
• Ascertain the age of the data – has it passed its ‘USE BY date’?
• What needs to be done to make this data usable/valuable?

Page 19: Data Quality

Data Audit Technology

Tools to report on the quality of data - attention is drawn to those fields that require analysis.

Against each column name
• Minimum Value and Maximum Value
• Mean, Median, Mode
• Minimum Length and Maximum Length
• Mean and Mode Length
• Defined Data Type and number/% records that conflict
• % populated with valid characters (excluding spaces)
• Number of unique values
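
A column profile of this kind can be sketched in a few lines of Python using only the standard library. This is an illustration of the statistics listed on this slide, not a description of any particular audit tool; the function name `profile_column`, the report keys and the sample data are assumptions.

```python
from collections import Counter
from statistics import mean, median


def profile_column(values, expected_type=int):
    """Profile one column given as a list of raw strings (None or '' = missing)."""
    populated = [v for v in values if v is not None and v.strip() != ""]
    typed, conflicts = [], 0
    for v in populated:
        try:
            typed.append(expected_type(v))
        except ValueError:
            conflicts += 1  # value does not match the defined data type
    lengths = [len(v) for v in populated]
    report = {
        "pct_populated": round(100 * len(populated) / len(values)) if values else 0,
        "type_conflicts": conflicts,
        "unique_values": len(set(populated)),
    }
    if lengths:
        report.update(
            min_length=min(lengths),
            max_length=max(lengths),
            mean_length=mean(lengths),
            mode_length=Counter(lengths).most_common(1)[0][0],
        )
    if typed:
        report.update(
            min_value=min(typed),
            max_value=max(typed),
            mean_value=mean(typed),
            median_value=median(typed),
            mode_value=Counter(typed).most_common(1)[0][0],
        )
    return report


# Hypothetical "number of employees" column with blanks and a bad value.
employees = ["12", "250", "", "12", "n/a", None, "4000"]
report = profile_column(employees)
print(report)
```

A high conflict count or a low populated percentage flags a field for the closer analysis the slide describes.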

Page 20: Data Quality

Data Report – Logistics Company

Example of some of the data irregularities identified – addresses in the name field, addresses and postcodes in the Town field, lower case characters, invalid postcodes etc.

What lies underneath?

Page 21: Data Quality

Drill Down to Format

Postcode

Data Format   No of Records   Sample of Data
XX## #XX      1203            AB12 3AB
XX##X #XX     63              AB12A 3AB
XX# #XX       2014            AB1 3AB
XXXXX#XXX     1203            ABFDA1ABC

Telephone Number

Data Format            No of Records   Sample of Data
##### ######           21003           01932 124689
#### ### ####          1095            0115 236 1236
##### ###### XXXX###   2014            01892 226819 ext.354
XX XXX XXXX            54              Do Not Call
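
The format drill-down can be approximated by masking each value (letters become X, digits become #) and counting how many records share each mask. The Python sketch below assumes that spaces and punctuation are kept as-is, which may differ from the tool behind this slide; the function names and sample values are illustrative.

```python
from collections import Counter


def format_mask(value):
    """Map letters to X and digits to #, keeping spaces and punctuation as-is."""
    return "".join(
        "X" if ch.isalpha() else "#" if ch.isdigit() else ch for ch in value
    )


def drill_down(values):
    """Count records per format mask, most common first."""
    return Counter(format_mask(v) for v in values).most_common()


postcodes = ["AB12 3AB", "CD34 5EF", "AB1 3AB", "ABFDA1ABC"]
for mask, count in drill_down(postcodes):
    print(f"{mask:<12}{count}")
```

Masks that match no valid pattern, such as XXXXX#XXX in a postcode field, point straight at the records that need attention.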

Page 22: Data Quality

How to Fix what needs Fixing!

• Internal processes
— Data cleansing
— Corrections
— Use of address enhancement software
— Use of touch-points
— Use of people in the business who know
— Source evaluation

Page 23: Data Quality

Internal processes

Common functions of data cleansing technology
• Find and Replace
• Standardisation: compare values from the given column with a column in a compiled Knowledge Base e.g. list of Titles

Input        Standard
Mr           Mr
Mr.          Mr
Mister       Mr
Mrs          Mrs
Mrs.         Mrs
Miss         Ms
Ms           Ms
Ms.          Ms
Dr           Dr
Dr.          Dr
Doctor       Dr
Prof         Prof
Professor    Prof
Prof.        Prof
Father       Father
Fr           Father
Fr.          Father
Sgt          Sgt
Sergeant     Sgt
Col          Colonel
Col.         Colonel
Colonel      Colonel
Lieut.       Lt
Lieutenant   Lt
Lt.          Lt
Lt           Lt
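
Standardisation against a Knowledge Base amounts to a lookup from each input variant to its standard form. The Python sketch below uses a small slice of the title list on this slide; ignoring case and a trailing full stop is an assumption made here to keep the lookup table short, and the names `TITLE_KB` and `standardise_title` are hypothetical.

```python
# A small slice of the title Knowledge Base shown on this slide.
TITLE_KB = {
    "mr": "Mr", "mister": "Mr",
    "mrs": "Mrs",
    "miss": "Ms", "ms": "Ms",
    "dr": "Dr", "doctor": "Dr",
    "prof": "Prof", "professor": "Prof",
    "fr": "Father", "father": "Father",
}


def standardise_title(raw):
    """Return (standard, matched); unmatched input is handed back for review."""
    key = raw.strip().rstrip(".").lower()  # assumption: ignore case and a trailing full stop
    if key in TITLE_KB:
        return TITLE_KB[key], True
    return raw, False


for raw in ["Mr.", "Mister", "Miss", "Rev"]:
    print(raw, "->", standardise_title(raw))
```

Values that find no match are better routed to manual review than guessed, so the Knowledge Base grows from real data.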

Page 24: Data Quality

Internal processes

Common functions of data cleansing technology
• Find and Replace
• Standardisation: compare values from the given column with a column in a compiled Knowledge Base e.g. Job Titles

Input                       Standard
Personnel Manager           HR Manager
Staff Manager               HR Manager
HR Manager                  HR Manager
Human Resources Manager     HR Manager
Staff Development Manager   HR Manager
Accounts Clerk              Accounts Clerk
Accounts Officer            Accounts Clerk
Accounts                    Accounts Clerk
Accounts Receivable         Accounts Clerk
Petty Cash Clerk            Accounts Clerk
Managing Director           CEO/MD
MD                          CEO/MD
CEO                         CEO/MD
Chief Executive Officer     CEO/MD
Man Dir                     CEO/MD
Mng Director                CEO/MD
Secretary to MD             CEO/MD
PA to Sales Director        PA/Sec
Personal Assistant          PA/Sec
Secretary                   ???

Page 25: Data Quality

Internal processes

Common functions of data cleansing technology
• Find and Replace
• Standardisation: compare values from the given column with a column in a compiled Knowledge Base e.g. Job Titles
• Data split: Divide data in a single field into multiple fields e.g. Mr John Smith to be divided into three fields of Title, First Name and Surname
• De-duplication and Merge/purge
• Case conversion
• Address technology – correction, replacement, batch and interactive
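
Three of these functions (data split, case conversion and de-duplication) can be illustrated together in a short Python sketch. It is deliberately naive: the title list is an assumption, `str.title()` mishandles surnames such as McDonald, and matching on first name plus surname alone would merge different people who share a name. Production merge/purge tools use far richer matching.

```python
KNOWN_TITLES = {"Mr", "Mrs", "Ms", "Miss", "Dr", "Prof"}  # illustrative list


def split_name(full_name):
    """Split 'Mr John Smith' into Title, First Name and Surname, fixing case."""
    parts = full_name.split()
    title = ""
    if parts and parts[0].rstrip(".").capitalize() in KNOWN_TITLES:
        title = parts.pop(0).rstrip(".").capitalize()
    first = parts[0].capitalize() if len(parts) > 1 else ""
    surname = " ".join(parts[1:] if len(parts) > 1 else parts).title()
    return {"title": title, "first_name": first, "surname": surname}


def dedupe(names):
    """Keep the first record for each (first name, surname) key."""
    seen, kept = set(), []
    for name in names:
        rec = split_name(name)
        key = (rec["first_name"], rec["surname"])
        if key not in seen:
            seen.add(key)
            kept.append(rec)
    return kept


records = dedupe(["Mr John Smith", "mr. john SMITH", "Dr Jane Doe"])
print(records)
```

Splitting and case-normalising before matching is what lets "Mr John Smith" and "mr. john SMITH" be recognised as the same contact.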

Page 26: Data Quality

External Processes

• Bureau services
— Name and address enhancement
— Verification, insertion, correction
— Data augmentation
— Telephone services – calling to correct details
— Dynamics – B2C
  • National Change of Address
  • Gone Away Suppression
  • Mortascreen Plus (Grey market)
  • Mortascreen
  • Bereavement register
— Dynamics – B2B
  • Mergers & acquisitions
  • Job changes
  • Status change
  • Purchasing strategy (central/local)
  • Official name/colloquial name
  • Business demographics

Page 27: Data Quality

External Data Example

Attributes: Company Name, Postcode, Business Demographics, Sector, Registration Code, Advertising spend, Job Function, Job Title, Turnover, Product/Service

1. Business demographics: Enhancement/verification
2. PAF data (UK & Foreign): Address verification & formatting
3. Weather/Travel Info
   Exhibitions organiser
4. Advertising Monitoring: Market share, expenditure comparison
5. Sector performance

Page 28: Data Quality

Multi-source: The strength of “blended” data

[Chart labels: Source A; Source B; No. of Employees in the company]

Page 29: Data Quality

Data Business Rules

• Business rules manage the validation process and the ongoing protection of data quality
• Make your rules as stringent as you can to begin with, then assess the volumes of rejects and adjust accordingly
• Quarantine offenders
• Impose rules on internal data acquisition
• Ensure they are included in the brief for any external data capture resources

Page 30: Data Quality

Example of Business Rules

Database attribute: Business Rules
Title: From list
Forename
Surname: Mandatory
Address line 1: Mandatory
Address line 2
Town
County
Postcode: Mandatory
Country: From list
Telephone number
E-mail address: Must contain @
Date of birth: Set format
Marital status: From list
Gender: From list
Number of children <18
Ref Number: System Generated
Source: From list

Use PAF file to populate address
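
Rules like these can be applied at the point of capture, with failing records quarantined instead of loaded. The Python sketch below implements a subset of the rules on this slide; the reference lists, the DD/MM/YYYY date format and the field names are assumptions, since the slide says only "From list" and "Set format".

```python
import re

# Hypothetical reference lists; the slide only says "From list".
TITLES = {"Mr", "Mrs", "Ms", "Dr"}
COUNTRIES = {"United Kingdom", "Ireland", "France"}
DOB_FORMAT = re.compile(r"\d{2}/\d{2}/\d{4}")  # assumed "set format": DD/MM/YYYY


def check_record(rec):
    """Return a list of rule violations for one record (empty list = pass)."""
    errors = []
    for field in ("surname", "address_line_1", "postcode"):
        if not (rec.get(field) or "").strip():
            errors.append(f"{field}: mandatory")
    if rec.get("title") and rec["title"] not in TITLES:
        errors.append("title: not in list")
    if rec.get("country") and rec["country"] not in COUNTRIES:
        errors.append("country: not in list")
    if rec.get("email") and "@" not in rec["email"]:
        errors.append("email: must contain @")
    if rec.get("date_of_birth") and not DOB_FORMAT.fullmatch(rec["date_of_birth"]):
        errors.append("date_of_birth: wrong format")
    return errors


def apply_rules(records):
    """Split records into accepted and quarantined (with their reasons)."""
    accepted, quarantined = [], []
    for rec in records:
        errors = check_record(rec)
        if errors:
            quarantined.append((rec, errors))
        else:
            accepted.append(rec)
    return accepted, quarantined


good = {"title": "Mr", "surname": "Smith", "address_line_1": "1 High St",
        "postcode": "AB12 3AB", "email": "j.smith@example.com"}
bad = {"surname": "Jones", "address_line_1": "2 Low Rd", "postcode": "",
       "email": "jones.example.com"}
accepted, quarantined = apply_rules([good, bad])
print(len(accepted), quarantined[0][1])
```

Counting the quarantined records per rule gives the reject volumes needed to decide whether a rule should be relaxed.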

Page 31: Data Quality

Data Strategy

Data is volatile

A data strategy is required for keeping it up to date

• Documented
• Maintained
• Reviewed
• Internal & external data

Page 32: Data Quality

Enterprise Data Maturity Model

[Diagram: maturity stages Undisciplined, Reactive, Proactive and Governed, each labelled with a Think/Act scope (Local, Global, Collectively). Scales: Benefit (High/Low) and Risk (Low/High). Technologies plotted: Direct Marketing; Database Marketing & Sales Force Automation; Data Warehousing (Enterprise, Project, Explorer, Marts); ERP; CRM (Operational, Analytical, Collaborative); Customer Data Integration; Product Data Integration; Master Data Management; Business Process Management; Business Intelligence; Service Orientated Architecture; Data Governance]

Page 33: Data Quality

Remember the Real World

Acquisition: What is most useful? 80/20
Retention: What is most easily available/done?
Utilisation: What is reliable?

Costs

You cannot do it all overnight

Any enhancement to the data must be driven by commercial benefit

Page 34: Data Quality

Importance Accepted in the Business

• Lip service to business as usual
• Incentives or penalties
• Demonstration of the implications of poor data and/or how good data makes them more effective at their job
• Ensure they know how important it is that they comply
• Make it easy for them to adhere to the rules
• Listen to them and address their problems – their view of poor quality data may be different to yours
• Be prepared to change
— Software amendments
— Business processes
— Forum or reporting channel for data issues

Page 35: Data Quality

Touch Points

• Encountering customers as part of regular business processes - the touch points
• Opportunities for
— Acquiring new data
— Qualifying existing contacts
— Verifying or updating existing data
— Testing the relationship (Jenkinson 1995)
• Consider all of these opportunities within your business processes

Page 36: Data Quality

And finally….

• Know your data
• Document your data strategy
• Score the data based on level of confidence
• Determine internal & external solutions
• Create rules to apply at all data collection points – don’t forget your external data capture bureaux, partners and sales channels
• Regular review in the light of on-going data quality
• Learn from your experience
• Be prepared for change to processes and software
• Achieve quality and maintain quality – get it right, keep it right!

Page 37: Data Quality

Thank you

[email protected]

