Post on 08-May-2015
Data Quality Doesn’t Just Happen
And Here’s What Some of the Industry’s Most Influential Players are Doing About it
June 2013
AGENDA
1. Why is data quality an issue?
2. What are industry players doing about it?
3. Why was TrueSample created?
Back to Basics: Why does data quality matter?
“Online panels have stormed the market research industry, offering access to inexpensive samples quickly — but at the same time, firms report anxiety about the quality of the sample…”
Brad Bortner
Forrester
“Industry associations launch major initiatives to investigate and restore online research quality.”
Industry Associations
CASRO, AMA, ESOMAR, ARF
“P&G speaks out about online data quality issues at the Client Summit, sparking industry-wide discourse”
Kim Dedeker
P&G & Kantar
The Market Research Industry Has Been Struggling to Address Online Data Quality for Years
Independent Research from The ARF Identifies 41% Email Address Overlap Across Panels
Panelist Duplication/Multi-Panel Membership
# of Panels   Total # Panelists   % of Total Panelists Validated   Total Responses Taken by Panelists in This Section   % of Responses
1             15,747,937          78.23%                           4,580,489                                            11.21%
2             2,668,338           13.25%                           7,313,709                                            17.90%
3             897,742             4.46%                            4,962,607                                            12.15%
4             384,027             1.91%                            3,842,079                                            9.40%
5             186,938             0.93%                            3,088,652                                            7.56%
6             98,951              0.49%                            2,658,842                                            6.51%
7             55,989              0.28%                            2,314,391                                            5.66%
8             32,760              0.16%                            2,014,186                                            4.93%
9             20,324              0.10%                            1,764,155                                            4.32%
10            13,369              0.07%                            1,564,083                                            3.83%
11            9,278               0.05%                            1,415,787                                            3.47%
12            6,231               0.03%                            1,174,433                                            2.87%
13            4,162               0.02%                            2,278,468                                            5.58%
14            2,474               0.01%                            663,378                                              1.62%
15            1,475               0.01%                            692,292                                              1.69%
16            763                 0.00%                            253,483                                              0.62%
17            366                 0.00%                            139,181                                              0.34%
18            159                 0.00%                            73,002                                               0.18%
19            69                  0.00%                            30,923                                               0.08%
20            37                  0.00%                            19,850                                               0.05%
21            24                  0.00%                            9,373                                                0.02%
22            9                   0.00%                            4,373                                                0.01%
24            1                   0.00%                            0                                                    0.00%
25            1                   0.00%                            0                                                    0.00%
26            2                   0.00%                            0                                                    0.00%
29            1                   0.00%                            0                                                    0.00%
34            1                   0.00%                            0                                                    0.00%
36            1                   0.00%                            0                                                    0.00%
TOTAL         20,131,429          100.00%                          40,857,736                                           100.00%
• 78% of submitted and validated panelists only belong to a single panel
• HOWEVER…..
• 50% of survey responses come from panelists who are members of 5+ panels!
• Less than 1% of total panelists account for more than 15% of survey responses, and they are members of an average of 13 panels
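The headline percentages can be re-derived from the table. The following is a minimal sketch, assuming the rows are copied from the table above as (number of panels, panelists, responses); the names `ROWS` and `shares` are illustrative and not part of the ARF study.

```python
# Rows copied from the ARF table: (panels joined, panelists, responses).
ROWS = [
    (1, 15_747_937, 4_580_489), (2, 2_668_338, 7_313_709),
    (3, 897_742, 4_962_607), (4, 384_027, 3_842_079),
    (5, 186_938, 3_088_652), (6, 98_951, 2_658_842),
    (7, 55_989, 2_314_391), (8, 32_760, 2_014_186),
    (9, 20_324, 1_764_155), (10, 13_369, 1_564_083),
    (11, 9_278, 1_415_787), (12, 6_231, 1_174_433),
    (13, 4_162, 2_278_468), (14, 2_474, 663_378),
    (15, 1_475, 692_292), (16, 763, 253_483),
    (17, 366, 139_181), (18, 159, 73_002),
    (19, 69, 30_923), (20, 37, 19_850),
    (21, 24, 9_373), (22, 9, 4_373),
    (24, 1, 0), (25, 1, 0), (26, 2, 0),
    (29, 1, 0), (34, 1, 0), (36, 1, 0),
]

TOTAL_PANELISTS = sum(p for _, p, _ in ROWS)
TOTAL_RESPONSES = sum(r for _, _, r in ROWS)


def shares(rows, min_panels):
    """Return (panelist share, response share) for members of >= min_panels panels."""
    sel_p = sum(p for n, p, _ in rows if n >= min_panels)
    sel_r = sum(r for n, _, r in rows if n >= min_panels)
    return sel_p / TOTAL_PANELISTS, sel_r / TOTAL_RESPONSES


# Share of panelists who belong to exactly one panel.
single_panel_share = ROWS[0][1] / TOTAL_PANELISTS
# Shares for panelists who belong to five or more panels.
heavy_panelists, heavy_responses = shares(ROWS, 5)

print(f"single-panel panelists: {single_panel_share:.1%}")
print(f"5+ panel members: {heavy_panelists:.1%} of panelists, "
      f"{heavy_responses:.1%} of responses")
```

Calling `shares(ROWS, k)` with other values of `k` shows how quickly the response share concentrates among a small group of multi-panel members.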
First-Hand Evidence
Project: Technology A&U study
Goal: Compare clean/unclean sample
Results of unclean sample:
• Unrealistic segmentation solutions
• Higher mean scores and SD’s
• Degradation of sensitivity of significance tests
- From Steve Schwartz’s presentation at the ’09 IIR Market Research Event
Takeaway: Data from unclean sample would have led to different business decisions
Clients Are Able to Identify Analytical Issues with Data Quality in Online Research Projects
Top Tech Firm
First-Hand Evidence Project: Product launch in-home usage study (IHUT)
Goal: Test product against 3 discrete sample populations and ready for commercial product launch
Results:
• Lack of quality controls resulted in 50% of respondents receiving more than one product during the usage period
• Research Impact: All three studies had to be reviewed – key measures were undeterminable
• Business Impact: Estimated loss in revenue of $15 million due to delays, not to mention tarnished reputation with retailers
Takeaway: Lack of quality controls/measures can cause significant rework and expense
Clients Experience Operational Issues with Data Quality in Online Research Projects
Top CPG Firm
Quantifying the Risk of Bad Respondents
• Risk Ratio is defined as the ratio of the probability of getting a wrong answer to the baseline probability of 5%, based on sampling theory.
• On average, clients see 20% or more of the respondents in their surveys fail at least one TrueSample quality check, meaning that their risk doubles when TrueSample is not applied!
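The Risk Ratio definition above amounts to a single division. The sketch below assumes only that definition; the function names are illustrative, and the mapping from the share of failing respondents to the probability of a wrong answer is not given in the slides, so it is not modeled here.

```python
BASELINE_ERROR = 0.05  # baseline probability of a wrong answer, per the definition above


def risk_ratio(p_wrong, baseline=BASELINE_ERROR):
    """Ratio of the probability of a wrong answer to the baseline probability."""
    if not 0.0 <= p_wrong <= 1.0:
        raise ValueError("p_wrong must be a probability between 0 and 1")
    return p_wrong / baseline


def p_wrong_for_ratio(ratio, baseline=BASELINE_ERROR):
    """Probability of a wrong answer implied by a given risk ratio."""
    return ratio * baseline


# A "doubled" risk (ratio of 2) corresponds to a 10% chance of a wrong answer.
print(risk_ratio(0.10))
print(p_wrong_for_ratio(2))
```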
What are Industry Players Doing About it?
Data from Confirmit 2012 Annual MR Software Survey
• Penny for your thoughts – most online surveys today are incentivised
  – Nearly six out of 10 (57%) of research companies are using incentivised panels for between two-thirds and 100% of their samples. Only a few (7%) are not using rewards at all.
• Independent panel verification is the exception not the norm
  – Around three-quarters (76%) of panel operators do not subscribe to independent panel verification services. Even among large companies 58% do not do this.
• Most MR companies run simple fraud prevention checks on online responses
  – Most companies are checking for speeding by respondents (73%) and nearly two-thirds (63%) look for ‘straightlining’: two quality control methods that many data collection tools make easy to apply.
• More thorough respondent fraud checks are largely shunned
  – Just over a half surveyed (52%) use challenge questions, and fewer still some of the more high tech methods.
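The two simple checks named above, speeding and straightlining, can be sketched in a few lines. This is an illustration only: the thresholds (`fraction`, `min_items`) and the sample data are hypothetical, and the slides do not describe how any particular tool implements these checks.

```python
from statistics import median


def is_speeder(seconds, all_seconds, fraction=0.33):
    """Flag a completion time below `fraction` of the median time (hypothetical rule)."""
    return seconds < fraction * median(all_seconds)


def is_straightliner(grid_answers, min_items=4):
    """Flag a grid question where every item received the same answer."""
    return len(grid_answers) >= min_items and len(set(grid_answers)) == 1


# Made-up responses: completion time in seconds and answers to one grid question.
responses = {
    "r1": {"seconds": 610, "grid": [4, 2, 5, 3, 4]},
    "r2": {"seconds": 95, "grid": [3, 3, 3, 3, 3]},
    "r3": {"seconds": 540, "grid": [1, 2, 2, 4, 5]},
}

times = [r["seconds"] for r in responses.values()]
flags = {
    rid: {
        "speeding": is_speeder(r["seconds"], times),
        "straightlining": is_straightliner(r["grid"]),
    }
    for rid, r in responses.items()
}

for rid, result in flags.items():
    print(rid, result)
```

A relative threshold (a fraction of the median) is used here so the rule adapts to survey length; a fixed minimum time would be an equally plausible choice.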
Clients Implement Standardized Quality Requirements – Suppliers Seek an Automated, Systematic Solution to Comply
As More Clients Apply Standard Online Research Quality Requirements TrueSample Will Help Clients Meet Them
FoQ2 is Counting on the TrueSample Quality Council
From the FoQ2 analyses and insights:
• The ARF and FoQ2 participants will produce important findings and deliver new guidelines with strong recommendations over the next few months.
• The ARF is counting on TrueSample and the TrueSample Quality Council to help translate FoQ2 learning into advanced online research practice applications and Research-on-Research.
Companies Coming Together to Create an Industry Standard
Clients:
Sample Suppliers:
Research Companies:
Survey Platforms:
Technology Platforms:
Federated
Why was TrueSample created?
Survey Design & Creation
Panel Management & Selection
Data Collection
Analyze & Improve
SAMPLE
TrueSample: Provide a consistent and scalable data quality platform for online research
TrueSample: Help people seeking insights make better decisions
Through applying the best available, independent, and comprehensive data quality solution in every country where they conduct online quantitative research.
Through reducing the risk of making poor decisions as a result of applying TrueSample technology and algorithms to respondents and survey instruments to systematically and comprehensively eliminate "bad" data wherever possible.
Research-on-Research (RoR) has Been the Foundation of the TrueSample Quality Council
Past RoR Examples
Impact of Identity Verification on Hard-to-Reach Groups
Impact of Chronically Unengaged Respondents on Data Quality
Impact of Survey Design on Data Quality
Impact of ‘Bad’ Respondents on Business Decisions
Real Check Postal
SurveyScore
Engagement Check
Real Check Social
“The goal of the TrueSample Research-on-Research Sub-Committee is to drive a research agenda that identifies and provides empirical evidence related to techniques that can be incorporated into the TrueSample product to maintain or enhance research data quality, an important component in minimizing the risk of incorrect business decisions.”
Social Media & River Sample
Mobile Device Data Collection
More Robust Analytics & Question Types
TrueSample Quality Council RoR Sub-Committee
Prioritization Process
DESIGN FIELD ANALYZE
Results will inform TrueSample product roadmap
From RoR to Product Roadmap
Alternative Identity Validation Variables
• Phase 1 = RoR efforts to ascertain WHAT challenges need to be solved for
• Phase 2 = RoR efforts to ascertain HOW to solve for the challenges
Real
› Panelists are who and where they say they are
› Identity validation with reputable, third-party databases
Unique
› Panelists unique within and across all Certified Panels
› Machine fingerprinting ensures no duplicate survey takers
Engaged
› No straight-lining respondents
› No speeding respondents
Qualified
› Respondents meet exclusion criteria for survey
› Respondents’ survey-taking behavior tracked over time
SurveyScore
› Predictive models improve survey design before launch
› Actual survey engagement scored and benchmarked
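The Unique check relies on machine fingerprinting to spot the same survey taker within and across panels. The sketch below shows only the general idea of hashing device attributes and grouping matching hashes; the attribute names, the SHA-256 choice, and both function names are assumptions for illustration, not a description of TrueSample's actual fingerprinting.

```python
import hashlib


def fingerprint(attributes):
    """Hash a dict of device attributes into a stable hex digest (illustrative only)."""
    canonical = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_duplicates(records):
    """Group (panel, panelist_id) pairs that share a fingerprint.

    `records` is an iterable of (panel, panelist_id, attributes) tuples.
    Returns only the fingerprints seen more than once.
    """
    seen = {}
    for panel, panelist_id, attributes in records:
        seen.setdefault(fingerprint(attributes), []).append((panel, panelist_id))
    return {fp: members for fp, members in seen.items() if len(members) > 1}


# Hypothetical records: the same machine enrolled in two different panels.
laptop = {"user_agent": "ExampleBrowser/1.0", "screen": "1440x900", "timezone": "UTC-5"}
phone = {"user_agent": "ExampleMobile/2.0", "screen": "750x1334", "timezone": "UTC+1"}
records = [
    ("panel_a", "a-001", laptop),
    ("panel_b", "b-417", dict(laptop)),
    ("panel_a", "a-002", phone),
]

duplicates = find_duplicates(records)
for members in duplicates.values():
    print("same machine:", members)
```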
TrueSample is a Technology that Provides Consistent, Objective, and Automated Quality
Thank you!
TrueSample 2013 Initiatives
CONSISTENT
• Consistency scoring at the panel level to aid in the selection of the most appropriate panel for a particular study, as well as to proactively identify any significant changes to a panel over time that may affect results in a study
• Consistency scoring at the individual panelist level to aid in the removal of panelists that habitually provide inconsistent responses
MOBILE
• Brings the full benefits of Real, Unique, Qualified, Engaged, Consistent, and SurveyScore to mobile devices
• Specifically designed for app-based research conducted on smartphone and/or tablet devices (iOS, Android, etc.)
RIVER SAMPLE
• Extends Panelist Validation functionality around Real and Unique to sample sources utilizing a river sampling methodology
• Optimizes record submission process and functionality for suppliers utilizing a river sampling methodology
PROJECT: TrueSample Consistent
PROJECT DESCRIPTION:
- Identify consistency of an individual panelist
- Identify consistency of a panel
EXPECTED COMPLETION: Phase 1-June 2013, Phase 2-October 2013
PROJECT LEADER: TrueSample

PROJECT: Engagement Algorithm Redesign
PROJECT DESCRIPTION:
- Replace the current parametric approach with non-parametric clustering-based algorithm
EXPECTED COMPLETION: September 2013
PROJECT LEADER: TrueSample

PROJECT: Mobile Surveys
PROJECT DESCRIPTION:
- Compare and contrast user’s browsing patterns for mobile vs. desktop based surveys.
- Understand the impact of a shorter survey, grid effect, straight lining, speeding, etc.
EXPECTED COMPLETION: October 2013
PROJECT LEADER: TrueSample

PROJECT: Dynamic/River Sample
PROJECT DESCRIPTION:
- Optimize real-time panelist validation process
- Understand impact of including this sample type in surveys
EXPECTED COMPLETION: Phase 1-April 2013, Phase 2-August 2013
PROJECT LEADER: TrueSample

PROJECT: Global Validation
PROJECT DESCRIPTION:
- Identifying different cultures – are data issues the same? Should validations/algorithms be different based on culture? Risk analysis in non-US country. Is engaged check different?
EXPECTED COMPLETION: Fall TSQC
PROJECT LEADER: Mktg Inc/Research Now

PROJECT: Operationalize SurveyScore
PROJECT DESCRIPTION:
- What does a change in SurveyScore really mean?
EXPECTED COMPLETION: Fall TSQC
PROJECT LEADER: Kantar
2013 RoR Roadmap
Survey Validation Evaluates Respondents in Real-Time as They Complete Surveys
[Flow diagram: Page 0, Page 1, Page 3+, Last Page, End Page]
• Entry URL: http://SURVEYURL?source-id=22345&respondent-id=772822
• Name/Address Form*: Respondent recognized as Real? (Yes / No)
• Create Digital Fingerprint
• Collect Page & Question Data
• Checks: Respondent Real? Unique? Engaged? Qualified?
• Store validation status for reporting
• End Page: Store validation status and SurveyScore for reporting
* Form can be enabled on a per-survey basis.
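The flow starts from a survey URL carrying `source-id` and `respondent-id` and ends with a stored validation status. A minimal sketch of that bookkeeping follows; only the two query parameter names and the four checks come from the slide, while the `ValidationStatus` class and `parse_entry_url` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from urllib.parse import parse_qs, urlparse


@dataclass
class ValidationStatus:
    """Per-respondent record of the four checks named in the flow (illustrative)."""
    source_id: str
    respondent_id: str
    real: bool = False
    unique: bool = False
    engaged: bool = False
    qualified: bool = False

    @property
    def valid(self):
        # A respondent passes only if every check has passed.
        return self.real and self.unique and self.engaged and self.qualified


def parse_entry_url(url):
    """Read source-id and respondent-id from the survey entry URL."""
    query = parse_qs(urlparse(url).query)
    return ValidationStatus(
        source_id=query["source-id"][0],
        respondent_id=query["respondent-id"][0],
    )


status = parse_entry_url("http://SURVEYURL?source-id=22345&respondent-id=772822")
print(status.source_id, status.respondent_id, status.valid)

# As the respondent moves through the pages, each check result would be recorded.
status.real = status.unique = status.engaged = status.qualified = True
print(status.valid)
```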