Post on 10-Feb-2016
Quality Data – An Improbable Dream?
Elizabeth Vannan
Centre for Education Information
Victoria, BC, Canada
Information quality is a journey, not a destination
- Larry P. English
Agenda
• Data Definitions and Standards Project
• What is Quality Data?
• The Cost of Poor-Quality Data
• Improving Data Quality – Our Process
• Questions?
BC Higher Education
• Canada’s westernmost province
• Population: 4.023 Million
• Land Area: 366,795 Sq Miles
• Publicly Funded Post-Secondary System
– 22 Colleges
– 6 Universities
CEISS
The Centre for Education Information is an independent organization that provides research and technology services to improve the performance of the BC education system
CEISS
• Implement and manage administrative systems
• Perform custom surveys, research and analysis
• Facilitate development and implementation of data standards
• Negotiate and manage province-wide software contracts (Oracle, SCT Banner, Datatel)
DDEF Project
The Problem
– Better data about the BC higher education sector needed for decision-making
– No infrastructure in place to facilitate the collection of data electronically
Data Definitions and Standards Project
Initiated in 1995
DDEF Project
The Solution
– Create data standards for all higher education information (Student, HR, Finance)
– Develop a data warehouse based on standards for reporting
– Implement a common technical infrastructure at all higher education institutions
DDEF Project
Project Goals
– Improve the quantity and QUALITY of data available
– Reduce the number of data and reporting requests
– Develop business information system to support the management and evaluation of the BC Post-Secondary system
How Are We Doing?
• 16 institutions implemented/implementing
• Institutions using data warehouses for internal reporting
• Data requests reduced
• Ministry using data
Why Focus on Data Quality?
• Poor data quality in our data warehouse impacts:
– Confidence
– Decision making
– Funding
Quality Data Are…
The Four Attributes of Data Quality
Quality Data Are…
• Accurate
– Free from errors
– Representative
Quality Data Are…
• Complete
– All values are present
Quality Data Are…
• Timely
– Recorded immediately
– Available when required
Quality Data Are…
• Flexible
– Data definitions understood
– Can be used for multiple purposes
Quality Data…
• Don’t have to be perfect
• Good enough to fill the business need at a price you’re willing to pay
Our Challenge
Defining Quality Criteria for Higher Education Data
Cost of Poor-Quality Data
• Business Process Costs
– Incorrect Registrations
– Inaccurate Tuition Billings
– Payroll Errors
Cost of Poor-Quality Data
• Rework
– Re-collect Data
– Correct Errors
– Data Verification
Cost of Poor-Quality Data
• Missed Opportunities
– Substandard Customer Service
– Poor Decision Making
– Loss of Reputation
Improving Data Quality
[Diagram: Improved Data Quality]
• Business Process Review
• Data Quality Assessment
• Data Cleansing
• Business Practice Change
Business Process Review
• When, where, how is data collected?
• Where is data stored?
• Who creates data?
• Who uses data?
• What outputs are required?
• What quality checks already exist?
Business Process Review
• Involve all stakeholders!
– For student data we involve: Executive, Registrar’s office, IT Department, Institutional Research
Business Process Review
• Results
– Understanding of business practices
– Identification of data creators, custodians, users
– Preliminary quality metrics
– Problem business practices
Data Quality Assessment
• Establish metrics
• Apply metrics to data
• Review results
Establish Metrics
• For each element determine quality criteria
– Acceptable range of values
– Acceptable syntax
– Comparison to known values
– Business rules
– Thresholds
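One way to make the criteria above concrete is to record them per data element in a small structure. The sketch below is only an illustration in Python: the `ElementMetric` class, its field names, and the sample `birth_year` element are assumptions and are not taken from the DDEF standards.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ElementMetric:
    """Quality criteria for one data element (all names are hypothetical)."""
    name: str
    pattern: Optional[str] = None                  # acceptable syntax (regular expression)
    allowed: Optional[frozenset] = None            # comparison to known values (code set)
    value_range: Optional[tuple] = None            # acceptable range (low, high), inclusive
    rule: Optional[Callable[[dict], bool]] = None  # business rule applied to the whole record
    threshold: float = 1.0                         # minimum share of records that must pass

    def passes(self, record: dict) -> bool:
        """Return True if the element in this record satisfies every criterion set."""
        value = record.get(self.name)
        if value is None:
            return False
        if self.pattern is not None and not re.fullmatch(self.pattern, str(value)):
            return False
        if self.allowed is not None and value not in self.allowed:
            return False
        if self.value_range is not None:
            low, high = self.value_range
            if not (low <= value <= high):
                return False
        if self.rule is not None and not self.rule(record):
            return False
        return True


def meets_threshold(metric: ElementMetric, records: list) -> bool:
    """Return True if the share of passing records reaches the metric's threshold."""
    if not records:
        return False
    passed = sum(1 for r in records if metric.passes(r))
    return passed / len(records) >= metric.threshold


# Hypothetical element: birth year must fall in a plausible range for half the records.
birth_year = ElementMetric(name="birth_year", value_range=(1900, 2000), threshold=0.5)
```

Keeping the threshold next to the element-level criteria lets the same definition drive both record-level checks and institution-level pass/fail decisions.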
Quality Metrics
Applying Metrics
• Collect known information for comparison
• Develop queries to test each of your validation criteria
– We use Oracle Discoverer, but other tools exist (MS Access, SQL)
Applying Metrics
Test 1
PEN must be 9 digits long. No characters, no shorter values acceptable
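The talk runs this test as a query in Oracle Discoverer. As a rough equivalent, the Test 1 rule can be sketched in Python as below; the function names, the `pen` field name, and the sample records are assumptions made for illustration.

```python
import re

# Exactly nine ASCII digits; letters, spaces, and shorter or longer values are rejected.
PEN_PATTERN = re.compile(r"[0-9]{9}")


def is_valid_pen(pen) -> bool:
    """Return True only for a string of exactly nine digits."""
    return isinstance(pen, str) and PEN_PATTERN.fullmatch(pen) is not None


def invalid_pen_records(records):
    """Return the records whose 'pen' field fails the syntax test."""
    return [r for r in records if not is_valid_pen(r.get("pen"))]


# Hypothetical sample data, not taken from the presentation.
sample = [
    {"student_id": 1, "pen": "123456789"},
    {"student_id": 2, "pen": "12345678"},   # too short
    {"student_id": 3, "pen": "12345678A"},  # contains a letter
]
flagged = invalid_pen_records(sample)
```

Returning the failing records, rather than only a count, is what allows specific students to be identified for data cleansing.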
Test 1 Results
Two Student Records Contain Invalid PEN Numbers
Test 1 Results
Invalid PENs
Data Entry Error?
Can Identify specific students for data cleansing
Applying Metrics
Test 2
At least 80% of student records must have a valid PEN number
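A threshold test like Test 2 reduces to a ratio compared against a cut-off. The following Python sketch assumes that "valid" means the nine-digit syntax rule from Test 1; the function names and the default threshold argument are illustrative only.

```python
import re


def valid_pen_share(pens) -> float:
    """Return the fraction of values that are exactly nine digits."""
    pens = list(pens)
    if not pens:
        return 0.0
    valid = sum(
        1 for p in pens
        if isinstance(p, str) and re.fullmatch(r"[0-9]{9}", p)
    )
    return valid / len(pens)


def meets_pen_threshold(pens, threshold: float = 0.80) -> bool:
    """Return True if at least `threshold` of the values are valid PENs."""
    return valid_pen_share(pens) >= threshold
```

For instance, an institution with four valid PENs out of five records sits exactly on the 80% threshold and passes, while three out of five does not.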
Test 2 Results
This Institution Meets the Quality Threshold
Applying Metrics
Test 3
No Duplicate PENs
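A duplicate check such as Test 3 amounts to counting how often each PEN occurs. This is a minimal Python sketch; the function name is hypothetical, and skipping missing values is an assumption, since the slide does not say how blanks are treated.

```python
from collections import Counter


def duplicate_pens(pens):
    """Return a dict of PEN -> occurrence count for PENs that appear more than once.

    Missing values (None or empty string) are skipped rather than counted.
    """
    counts = Counter(p for p in pens if p)
    return {pen: n for pen, n in counts.items() if n > 1}
```

The counts also hint at the cause: many PENs each repeated the same number of times would suggest a loading problem rather than individual data entry errors.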
Test 3 Results
This institution has a BIG problem!
Can we see more details?
Test 3 Results
Additional information reveals data loading problems
Reviewing Results
• Systematic approach needed
• Develop strategy for data cleaning
• Identify source of data problems
Deal with Disparate Data Shock!
Reviewing Results
• Insert a quality review checklist
Data Cleansing
• Location
– Administrative System?
– Staging Area?
• Who
• Scope
Typical Data Cleansing
• Correcting data entry errors
• Removing or correcting nonsensical dates
• Deleting “garbage” records
• Combining or deleting duplicates
• Updating and applying code sets
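Two of the cleansing tasks listed above, nonsensical dates and duplicate records, can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the ISO date format, the 1900 lower bound, the `pen` key, and the rule that the first non-empty value wins when records are combined.

```python
from datetime import date


def parse_plausible_date(text, earliest=date(1900, 1, 1), latest=None):
    """Parse an ISO date (YYYY-MM-DD); return None if invalid or out of range."""
    try:
        parsed = date.fromisoformat(text)
    except (TypeError, ValueError):
        return None  # not a string, or not a real calendar date
    upper = latest or date.today()
    return parsed if earliest <= parsed <= upper else None


def combine_duplicates(records, key="pen"):
    """Merge records that share `key`; the first non-empty value of each field wins."""
    merged = {}
    for record in records:
        target = merged.setdefault(record[key], {})
        for field, value in record.items():
            if target.get(field) in (None, ""):
                target[field] = value
    return list(merged.values())
```

In practice the merge rule is a business decision, so a real cleansing step would likely log every combined record for review instead of merging silently.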
Business Practice Change
• Two components
– Implementing changes to improve data quality
– Adopting ongoing data quality review process
Changing Business Practices is a Challenge
Get Stakeholder Support
Business Practice Change
• Education
• Centralizing responsibility for codes
• Consolidating data collection
• Implementing validation routines
• Changing business processes
Quality Review Process
• Review data regularly
• Make someone responsible
• Establish procedures for correcting data problems
• Communicate quality improvements
Some Changes in BC
• Creation of Data Manager position, responsible for code sets, data quality
• Regular education for registration clerks and other data creators
• Established relationships between data creators and users
• Re-engineered administrative systems
Improvements to BC Data
• Improved data quality and quantity
– Nonsensical dates almost eliminated
– Completeness of key elements improved (from 50% to 80-90%)
– Data now being collected for CE in standard format
Final Thoughts…
• Quality Data are Probable if you are willing to…
– Take a critical look at your existing data
– Implement changes to how you collect and manage data
– Invest the time to educate and communicate with data users and creators
– Make data quality improvement an ongoing process
Recommended Reading
• Brackett, Michael H., Data Resource Quality: Turning Bad Habits into Good Practices (New York: Addison-Wesley, 2000)
• English, Larry P., Improving Data Warehouse and Business Information Quality (New York: John Wiley and Sons, 1999)
• Redman, Thomas C., Data Quality for the Information Age (Boston: Artech House, Inc., 1996)
Thank You!
Presentation Available At www.ceiss.org
or evannan@ceiss.org