Date posted: 18-Dec-2014
Category: Technology
Uploaded by: stefan-urbanek
Data Quality Perception
data brewery
Dallas Data Brewery, June 2013
Topic
■ What is "high quality data"?
■ What are data quality expectations? (yours, or those of people or businesses you know)
■ Business issues and data quality: how do you deal with them?
■ What happens when you ignore it?
What is data quality?
Dimensions
■ completeness – data provided
■ accuracy – reflecting real world
■ credibility – regarded as true
■ timeliness – up-to-date
■ consistency – matching facts across datasets
■ integrity – valid references between datasets
... and there are more
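Some of these dimensions can be turned into simple numeric probes. As an illustration only, here is a minimal sketch of an integrity probe: the share of references in one dataset that point to an existing record in another. The function name, the field names and the sample rows are assumptions made up for this example, not part of the original talk.

```python
def integrity(child_rows, key, parent_keys):
    """Percentage of child rows whose `key` refers to an existing parent key."""
    parent_keys = set(parent_keys)
    if not child_rows:
        return 100.0  # nothing to check, so no broken references
    valid = sum(1 for row in child_rows if row.get(key) in parent_keys)
    return 100.0 * valid / len(child_rows)


# Hypothetical sample data: contracts referring to suppliers by id.
supplier_ids = [1, 2, 3]
contracts = [
    {"id": "a", "supplier_id": 1},
    {"id": "b", "supplier_id": 3},
    {"id": "c", "supplier_id": 7},     # dangling reference
    {"id": "d", "supplier_id": None},  # missing reference
]

print(integrity(contracts, "supplier_id", supplier_ids))
```

Here two of the four contracts refer to a known supplier, so the probe reports 50.0.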
Fallacies
■ “good data are error-free and valid”
■ “improving quality means cleansing”
■ “it is an IT problem”
■ “it can be fixed”
Short Story: Completeness
Open Public Procurements
from this...
... to this:
http://tendre.sme.sk
[Chart: monthly series from 2005-3 to 2010-9 on a 0% to 100% axis; annotations: "better", "have it all", "none"]
Quality measure
completeness: 55%
what percentage of the field is filled and successfully processed?
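One possible reading of this measure, sketched in Python: count a record as complete for a field only if the field is filled and, optionally, if a parser accepts its value. The function, its `parse` parameter and the sample records are assumptions for illustration; the original project may have computed the figure differently.

```python
def completeness(records, field, parse=None):
    """Percentage of records whose `field` is filled and successfully processed."""
    if not records:
        return 0.0
    ok = 0
    for record in records:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            continue  # field is not filled
        if parse is not None:
            try:
                parse(value)
            except (ValueError, TypeError):
                continue  # filled, but could not be processed
        ok += 1
    return 100.0 * ok / len(records)


# Hypothetical procurement records.
contracts = [
    {"supplier": "ACME", "amount": "1200.50"},
    {"supplier": "", "amount": "n/a"},
    {"supplier": "Beta", "amount": ""},
    {"supplier": "Gamma", "amount": "980"},
]

print(completeness(contracts, "amount", parse=float))  # filled and parseable
print(completeness(contracts, "supplier"))             # filled only
```

With these sample records the `amount` field scores 50.0 (one value is empty, one does not parse) and the `supplier` field scores 75.0.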
type 1 + type 2
[Chart: percentage of the field filled and successfully processed, by month from 2005-3 to 2010-9, on a 0% to 100% axis]
Quality measure
completeness: 88%
[Chart annotations: "better", "have it all", "none"]
What does “high quality data” mean?
85%?
Conclusion
appropriate for the given purpose
Data Project
■ define data quality requirements
■ measure during development
■ provide data quality report
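A minimal sketch of how these three steps could fit together: requirements are stated as minimum percentages per dimension, measurements are compared against them, and the result is a small report. The thresholds and purpose names below are hypothetical; only the 88% completeness figure comes from the example above.

```python
def quality_report(measured, requirements):
    """Compare measured quality (in %) with the required minimum per dimension."""
    report = []
    for name, required in sorted(requirements.items()):
        actual = measured.get(name)
        if actual is None:
            status = "not measured"
        elif actual >= required:
            status = "ok"
        else:
            status = "below requirement"
        report.append((name, required, actual, status))
    return report


measured = {"completeness": 88.0}

# Hypothetical requirements for two different purposes.
overview_requirements = {"completeness": 80.0}
audit_requirements = {"completeness": 99.0, "integrity": 100.0}

for purpose, requirements in [("overview", overview_requirements),
                              ("audit", audit_requirements)]:
    print(purpose)
    for name, required, actual, status in quality_report(measured, requirements):
        print(f"  {name}: required {required}, measured {actual}, {status}")
```

The same measured value passes one set of requirements and fails the other, which is the sense in which quality depends on the purpose.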
More topics
■ Data quality measurement: indicators, probes
■ Data quality management: roles, processes, impact
■ Data cleansing