Data Quality Doesn’t Just Happen: And Here’s What Some of the Industry’s Most Influential...

Post on 08-May-2015



Data quality isn’t always the sexiest topic, but it’s a critical conversation, and one that buyers and suppliers often neglect to have. The ramifications of ignoring it can cost millions of dollars. Some of the industry’s largest buyers and suppliers have found a simple solution, though, and it’s one that is available to everyone else too. Come hear how data quality concerns haven’t gone away, and what others are doing to make sure they and their insights are protected.


Data Quality Doesn’t Just Happen: And Here’s What Some of the Industry’s Most Influential Players Are Doing About It

June 2013

1. Why is data quality an issue?

2. What are industry players doing about it?

3. Why was TrueSample created?


Back to Basics: Why does data quality matter?

“Online panels have stormed the market research industry, offering access to inexpensive samples quickly — but at the same time, firms report anxiety about the quality of the sample…”

Brad Bortner


“Industry associations launch major initiatives to investigate and restore online research quality.”

Industry Associations


“P&G speaks out about online data quality issues at the Client Summit, sparking industry-wide discourse”

Kim Dedeker

P&G & Kantar

The Market Research Industry Has Been Struggling to Address Online Data Quality for


Independent Research from The ARF Identifies 41% Email Address Overlap Across Panels

Panelist Duplication/Multi-Panel Membership

# of Panels | Total # Panelists | % of Total Panelists Validated | Total Responses Taken by Panelists in This Section | % of Responses
1 | 15,747,937 | 78.23% | 4,580,489 | 11.21%
2 | 2,668,338 | 13.25% | 7,313,709 | 17.90%
3 | 897,742 | 4.46% | 4,962,607 | 12.15%
4 | 384,027 | 1.91% | 3,842,079 | 9.40%
5 | 186,938 | 0.93% | 3,088,652 | 7.56%
6 | 98,951 | 0.49% | 2,658,842 | 6.51%
7 | 55,989 | 0.28% | 2,314,391 | 5.66%
8 | 32,760 | 0.16% | 2,014,186 | 4.93%
9 | 20,324 | 0.10% | 1,764,155 | 4.32%
10 | 13,369 | 0.07% | 1,564,083 | 3.83%
11 | 9,278 | 0.05% | 1,415,787 | 3.47%
12 | 6,231 | 0.03% | 1,174,433 | 2.87%
13 | 4,162 | 0.02% | 2,278,468 | 5.58%
14 | 2,474 | 0.01% | 663,378 | 1.62%
15 | 1,475 | 0.01% | 692,292 | 1.69%
16 | 763 | 0.00% | 253,483 | 0.62%
17 | 366 | 0.00% | 139,181 | 0.34%
18 | 159 | 0.00% | 73,002 | 0.18%
19 | 69 | 0.00% | 30,923 | 0.08%
20 | 37 | 0.00% | 19,850 | 0.05%
21 | 24 | 0.00% | 9,373 | 0.02%
22 | 9 | 0.00% | 4,373 | 0.01%
24 | 1 | 0.00% | 0 | 0.00%
25 | 1 | 0.00% | 0 | 0.00%
26 | 2 | 0.00% | 0 | 0.00%
29 | 1 | 0.00% | 0 | 0.00%
34 | 1 | 0.00% | 0 | 0.00%
36 | 1 | 0.00% | 0 | 0.00%
TOTAL | 20,131,429 | 100.00% | 40,857,736 | 100.00%

• 78% of submitted and validated panelists only belong to a single panel


• About 50% of survey responses come from panelists who are members of 5+ panels!

• Less than 1% of total panelists account for more than 15% of survey responses, and they are members of an average of 13 panels
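The headline shares above can be recomputed from the counts in the ARF table. The following is a minimal Python sketch of that arithmetic; the names `ROWS` and `share_at_or_above` are illustrative and are not part of the ARF study.

```python
# Rows copied from the ARF table: (number of panels, panelists, responses).
ROWS = [
    (1, 15_747_937, 4_580_489), (2, 2_668_338, 7_313_709),
    (3, 897_742, 4_962_607), (4, 384_027, 3_842_079),
    (5, 186_938, 3_088_652), (6, 98_951, 2_658_842),
    (7, 55_989, 2_314_391), (8, 32_760, 2_014_186),
    (9, 20_324, 1_764_155), (10, 13_369, 1_564_083),
    (11, 9_278, 1_415_787), (12, 6_231, 1_174_433),
    (13, 4_162, 2_278_468), (14, 2_474, 663_378),
    (15, 1_475, 692_292), (16, 763, 253_483),
    (17, 366, 139_181), (18, 159, 73_002),
    (19, 69, 30_923), (20, 37, 19_850),
    (21, 24, 9_373), (22, 9, 4_373),
    (24, 1, 0), (25, 1, 0), (26, 2, 0),
    (29, 1, 0), (34, 1, 0), (36, 1, 0),
]

PANELISTS, RESPONSES = 1, 2  # column indexes within each row


def share_at_or_above(min_panels, column):
    """Fraction of the column total contributed by members of at least min_panels panels."""
    total = sum(row[column] for row in ROWS)
    part = sum(row[column] for row in ROWS if row[0] >= min_panels)
    return part / total


# Share of panelists who belong to exactly one panel.
single_panel_share = ROWS[0][PANELISTS] / sum(row[PANELISTS] for row in ROWS)
# Shares attributable to members of five or more panels.
heavy_response_share = share_at_or_above(5, RESPONSES)
heavy_panelist_share = share_at_or_above(5, PANELISTS)

print(f"Single-panel panelists: {single_panel_share:.1%}")
print(f"Responses from 5+ panel members: {heavy_response_share:.1%}")
print(f"Panelists on 5+ panels: {heavy_panelist_share:.1%}")
```

On these counts, roughly 2% of panelists (those on five or more panels) supply just under half of all responses, which is the concentration the bullets describe.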

First-Hand Evidence

Project: Technology A&U study

Goal: Compare clean/unclean sample

Results of unclean sample:

• Unrealistic segmentation solutions

• Higher mean scores and standard deviations (SDs)

• Degradation of sensitivity of significance tests

-From Steve Schwartz’ presentation at ‘09 IIR Market Research Event

Takeaway: Data from unclean sample would have led to different business decisions

Clients Are Able to Identify Analytical Issues with Data Quality in Online Research Projects

Top Tech Firm

First-Hand Evidence

Project: Product launch in-home usage test (IHUT)

Goal: Test the product against 3 discrete sample populations and ready it for commercial launch


• Lack of quality controls resulted in 50% of respondents receiving more than one product during the usage period

• Research Impact: All three studies had to be reviewed – key measures could not be determined

• Business Impact: Estimated loss in revenue of $15 million due to delays, not to mention a tarnished reputation with retailers

Takeaway: Lack of quality controls/measures can cause significant rework and expense

Clients Experience Operational Issues with Data Quality in Online Research Projects

Top CPG Firm

Quantifying the Risk of Bad Respondents

• Risk Ratio is defined as the ratio of the probability of getting a wrong answer to the baseline probability of 5%, based on sampling theory.

• On average, clients see 20% or more of the respondents in their surveys fail at least one TrueSample quality check, meaning that their risk is doubled when TrueSample is not applied!
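The Risk Ratio definition above reduces to a single division. The sketch below only illustrates that definition in Python; the function names are hypothetical, and the slides do not state how the 20% failure rate is converted into a probability of a wrong answer, so that step is deliberately left out.

```python
BASELINE_ERROR = 0.05  # baseline probability of a wrong answer, per the definition above


def risk_ratio(p_wrong, baseline=BASELINE_ERROR):
    """Ratio of the probability of a wrong answer to the baseline probability."""
    return p_wrong / baseline


def implied_error_probability(ratio, baseline=BASELINE_ERROR):
    """Probability of a wrong answer that corresponds to a given risk ratio."""
    return ratio * baseline


# A "doubled" risk (ratio of 2) corresponds to a 10% chance of a wrong answer.
doubled = implied_error_probability(2)
print(f"Risk ratio 2 implies p(wrong) = {doubled:.0%}")
print(f"Risk ratio at p(wrong) = 10%: {risk_ratio(0.10):.1f}")
```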

What Are Industry Players Doing About It?

Data from Confirmit 2012 Annual MR Software Survey

• Penny for your thoughts – most online surveys today are incentivised

– Nearly six out of 10 (57%) of research companies are using incentivised panels for between two-thirds and 100% of their samples. Only a few (7%) are not using rewards at all.

• Independent panel verification is the exception, not the norm

– Around three-quarters (76%) of panel operators do not subscribe to independent panel verification services. Even among large companies, 58% do not do this.

• Most MR companies run simple fraud prevention checks on online responses

– Most companies are checking for speeding by respondents (73%) and nearly two-thirds (63%) look for ‘straightlining’: two quality control methods that many data collection tools make easy to apply.

• More thorough respondent fraud checks are largely shunned

– Just over half of those surveyed (52%) use challenge questions, and fewer still use some of the more high-tech methods.
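The two simple checks named above, speeding and straightlining, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular data collection tool or TrueSample implements them: the one-third-of-median cutoff and both function names are hypothetical choices.

```python
from statistics import median


def flag_speeders(durations, fraction=1 / 3):
    """Return indexes of respondents who finished faster than fraction * median time.

    The one-third-of-median cutoff is an assumed, illustrative threshold.
    """
    cutoff = fraction * median(durations)
    return [i for i, seconds in enumerate(durations) if seconds < cutoff]


def is_straightliner(grid_answers):
    """True when every item in a rating grid received the same answer."""
    return len(grid_answers) > 1 and len(set(grid_answers)) == 1


# Completion times in seconds for five respondents; the third one rushed.
durations = [600, 540, 120, 660, 580]
speeders = flag_speeders(durations)

# Answers to a five-item rating grid for two respondents.
varied = [4, 2, 5, 3, 4]
flat = [3, 3, 3, 3, 3]

print("Speeders:", speeders)
print("Straightliners:", [is_straightliner(varied), is_straightliner(flat)])
```

In practice a flagged respondent would usually be reviewed against several checks together rather than removed on a single signal.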

Clients Implement Standardized Quality Requirements – Suppliers Seek an Automated, Systematic Solution to Comply

As More Clients Apply Standard Online Research Quality Requirements, TrueSample Will Help Clients Meet Them

FoQ2 is Counting on the TrueSample Quality Council

From the FoQ2 analyses and insights:

• The ARF and FoQ2 participants will produce important findings and deliver new guidelines with strong recommendations over the next few months.

• The ARF is counting on TrueSample and the TrueSample Quality Council to help translate FoQ2 learning into advanced online research practice applications and Research-on-Research.

Companies Coming Together to Create an Industry Standard


Sample Suppliers:

Research Companies:

Survey Platforms:

Technology Platforms:

Federated

Why was TrueSample created?

Survey Design & Creation

Panel Management &


Data Collection

Analyze & Improve


TrueSample: Provide a consistent and scalable data quality platform for online research

TrueSample: Help people seeking insights make better decisions

Through applying the best available, independent, and comprehensive data quality solution in every country where they conduct online quantitative research.

Through reducing the risk of making poor decisions as a result of applying TrueSample technology and algorithms to respondents and survey instruments to systematically and comprehensively eliminate "bad" data wherever possible.


Research-on-Research (RoR) has Been the Foundation of the TrueSample Quality Council

Past RoR Examples

Impact of Identity Verification on Hard-to-Reach Groups

Impact of Chronically Unengaged Respondents on Data Quality

Impact of Survey Design on Data


Impact of ‘Bad’ Respondents on Business Decisions

Real Check Postal



Real Check Social

“The goal of the TrueSample Research-on-Research Sub-Committee is to drive a research agenda that identifies and provides empirical evidence related to techniques that can be incorporated into the TrueSample product to maintain or enhance research data quality, an important component in minimizing the risk of incorrect business decisions.”

Social Media & River Sample

Mobile Device Data Collection

More Robust Analytics & Question Types

TrueSample Quality Council RoR Sub-Committee

Prioritization Process


Results will inform TrueSample product roadmap

From RoR to Product Roadmap

Alternative Identity Validation


• Phase 1 = RoR efforts to ascertain WHAT challenges need to be solved for

• Phase 2 = RoR efforts to ascertain HOW to solve for the challenges

› Panelists are who and where they say they are
› Identity validation with reputable, third-party databases

› Panelists unique within and across all Certified Panels
› Machine fingerprinting ensures no duplicate survey takers

› No straight-lining respondents
› No speeding respondents

› Respondents meet exclusion criteria for survey
› Respondents’ survey-taking behavior tracked over time

› Predictive models improve survey design before launch
› Actual survey engagement scored and benchmarked
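The uniqueness check listed above (one survey taker per machine, within and across panels) can be illustrated with a simple exact-match sketch in Python. This is not TrueSample's fingerprinting method, which the slides do not describe; the attribute names, the hashing scheme, and both function names are assumptions made for illustration, and real fingerprinting typically combines many more signals with fuzzy matching.

```python
import hashlib
from collections import defaultdict


def fingerprint(attributes):
    """Hash a dict of machine attributes into a stable hex digest."""
    # Sort the keys so the same attributes always produce the same digest.
    canonical = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_duplicates(records):
    """Group (panel, panelist_id) pairs that share a fingerprint.

    records is an iterable of (panel, panelist_id, attributes) tuples.
    Only fingerprints seen more than once are returned.
    """
    seen = defaultdict(list)
    for panel, panelist_id, attributes in records:
        seen[fingerprint(attributes)].append((panel, panelist_id))
    return {digest: members for digest, members in seen.items() if len(members) > 1}


# Hypothetical records: the same machine appears under two different panels.
laptop = {"user_agent": "ExampleBrowser/1.0", "screen": "1366x768", "timezone": "UTC-5"}
phone = {"user_agent": "ExampleMobile/2.0", "screen": "750x1334", "timezone": "UTC+1"}
records = [
    ("Panel A", "a-101", laptop),
    ("Panel B", "b-774", dict(laptop)),
    ("Panel A", "a-102", phone),
]

duplicates = find_duplicates(records)
for members in duplicates.values():
    print("Same machine:", members)
```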


TrueSample is a Technology that Provides Consistent, Objective, and Automated Quality

Thank you!

TrueSample 2013 Initiatives

CONSISTENT

• Consistency scoring at the panel level to aid in selecting the most appropriate panel for a particular study, and to proactively identify any significant changes to a panel over time that may affect results in a study

• Consistency scoring at the individual panelist level to aid in the removal of panelists that habitually provide inconsistent responses

MOBILE

• Brings the full benefits of Real, Unique, Qualified, Engaged, Consistent, and SurveyScore to mobile devices

• Specifically designed for app-based research conducted on smartphone and/or tablet devices (iOS, Android, etc.)

RIVER SAMPLE

• Extends Panelist Validation functionality around Real and Unique to sample sources utilizing a river sampling methodology

• Optimizes the record submission process and functionality for suppliers utilizing a river sampling methodology


TrueSample Consistent
- Identify consistency of an individual panelist
- Identify consistency of a panel
Phase 1: June 2013; Phase 2: October

Engagement Algorithm Redesign
- Replace the current parametric approach with a non-parametric, clustering-based algorithm
September 2013, TrueSample

Mobile Surveys
- Compare and contrast users’ browsing patterns for mobile vs. desktop-based surveys
- Understand the impact of a shorter survey, grid effect, straight-lining, speeding, etc.
October 2013, TrueSample

Dynamic/River Sample
- Optimize real-time panelist validation process
- Understand impact of including this sample type in surveys
Phase 1: April 2013; Phase 2: August

Global Validation
- Identifying different cultures: are data issues the same? Should validations/algorithms be different based on culture? Risk analysis in a non-US country. Is the engaged check different?
Fall, TSQC, Mktg Inc/Research

Operationalize SurveyScore
- What does a change in SurveyScore really mean?
Fall, TSQC, Kantar

2013 RoR Roadmap

Survey Validation Evaluates Respondents in Real-Time as They Complete Surveys

Name/Address Form*

Respondent recognized as


Page 0 Page 1



Page 3+ Last Page



* Form can be enabled on a per-survey basis.

Collect Page & Question Data

Store validation status for reporting

End Page: Store validation status and SurveyScore for reporting




Create Digital
