Date post: | 09-May-2015 |
Category: | Data & Analytics |
Upload: | alex-lomizov |
Proved.co Scores and Metrics
January 2014
Introduction
For each concept proved.co calculates:
— A score to show overall concept performance
— Five key metrics to show the concept’s strengths, weaknesses and areas for improvement
This document outlines the proved.co approach to score and metric calculation, as well as the background and framework behind it.
Architecture
[Architecture diagram. Components: Questionnaire, Survey distribution DB, Dashboard, Analytics DB. Layers: Front-end, Back-end. Stages: Collection, Computation, Visualization.]
Computation
Weighting
RIM procedure
Individual weights for each respondent, so that the sample's age and gender proportions match census data
Calculation
Raw scores, i.e. direct results of applying statistical formulae to the collected questionnaire data
Normalization
Normalized scores, i.e. raw scores rescaled to 0..100 using benchmarks
Proved Scores & Metrics
Weighting
Weighting
— Two target variables:
  — Gender: males and females (2 targets)
  — Age: 18-29, 30-49, 50+ (3 targets)
— Random Iterative Method (RIM):
  — Data is weighted by gender. Gender weighting factors are calculated and applied (Iteration 1)
  — Data is weighted by age. Age weighting factors are multiplied by the gender factors from Iteration 1 and applied (Iteration 2)
  — Data is re-weighted by gender. Gender weighting factors are multiplied by the factors from Iteration 2 and applied (Iteration 3)
  — Iterations continue until all targets are met or precision changes by no more than 1%, while weight factors stay within the [0.25; 4] limits
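The iteration described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not proved.co's actual implementation: the function names, the respondent layout (one dict per respondent), the stopping thresholds `target_tol` and `stall`, and the sample data are all hypothetical. Target shares are normalized to sum to 1 inside the function, because percentages rounded for presentation may not add up to exactly 100%.

```python
def weighted_shares(respondents, weights, variable):
    """Weighted share of each category of `variable` in the sample."""
    total = sum(weights)
    shares = {}
    for person, w in zip(respondents, weights):
        shares[person[variable]] = shares.get(person[variable], 0.0) + w / total
    return shares


def max_error(respondents, weights, targets):
    """Largest absolute gap between a weighted share and its target."""
    worst = 0.0
    for variable, goal in targets.items():
        norm = sum(goal.values())  # normalize rounded targets to sum to 1
        got = weighted_shares(respondents, weights, variable)
        for category, share in goal.items():
            worst = max(worst, abs(got.get(category, 0.0) - share / norm))
    return worst


def rim_weights(respondents, targets, max_iter=50,
                target_tol=0.001, stall=0.01, limits=(0.25, 4.0)):
    """Rake weights over each target variable in turn.

    Assumed stopping rule: stop when every target is within `target_tol`,
    or when the error improves by no more than `stall` (1%) in a cycle.
    Weight factors are clipped to `limits`.
    """
    low, high = limits
    weights = [1.0] * len(respondents)
    prev = max_error(respondents, weights, targets)
    for _ in range(max_iter):
        for variable, goal in targets.items():
            norm = sum(goal.values())
            got = weighted_shares(respondents, weights, variable)
            # factor = target share / current weighted share
            factors = {c: (s / norm) / got[c] for c, s in goal.items() if got.get(c)}
            weights = [min(high, max(low, w * factors.get(p[variable], 1.0)))
                       for p, w in zip(respondents, weights)]
        err = max_error(respondents, weights, targets)
        if err <= target_tol or prev - err <= stall * prev:
            break
        prev = err
    return weights


# Hypothetical sample of 100 respondents and hypothetical targets.
sample = (
    [{"gender": "male", "age": "18-29"}] * 30
    + [{"gender": "male", "age": "50+"}] * 10
    + [{"gender": "female", "age": "18-29"}] * 20
    + [{"gender": "female", "age": "50+"}] * 40
)
targets = {
    "gender": {"male": 0.5, "female": 0.5},
    "age": {"18-29": 0.4, "50+": 0.6},
}
weights = rim_weights(sample, targets)
print(weighted_shares(sample, weights, "gender"))
print(weighted_shares(sample, weights, "age"))
```

Each pass multiplies the current weights by the new factors, which matches the "multiplied by the previous iteration's factors" wording above. Clipping to the limits means that some targets may not be met exactly for very skewed samples.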
Weighting targets
            United Kingdom      USA
Males       49%                 49%
Females     52%                 52%
18-29 YO    17%                 23%
30-49 YO    38%                 36%
50+ YO      47%                 43%
Source      2012 census data    2011 census data
Special notes
Weighting is not applied to:
— Self-service plans, i.e. samples from client’s contact lists and river samples
— Audiences targeted on criteria other than age and gender, e.g. moms or car owners
Calculation
Framework
— Proved.co is a project of Bojole (UK) Ltd, a traditional market research company with eight years of concept testing experience
— When proved.co was developed, Bojole had norms for 1228 concept tests:
  — Raw data, i.e. more than 250 thousand completed questionnaires
  — Corroboration data, i.e. post-tests, ranking data, and instrumental variables
Corroboration
— Post-tests: market data for launched concepts. Available for a limited number of concepts. The most reliable corroboration
— Ranking data: max-diff ratings for sets of 30-90 concepts. Available for 1116 concepts. Quite reliable corroboration
— Instrumental variables: overall liking, purchase intent, etc. Available for all 1228 concepts. Questionable reliability
Score modeling
— Bojole decided to develop a single score that best represents the overall performance of a concept under test
— Bojole used iterative regression modeling to determine:
  — Variables to include in the score calculation
  — Score formulae
[Diagram: corroboration variables modeled against all available scaled diagnostic variables (Relevance, Uniqueness, Word of mouth, …).]
Score modeling
— The following set of variables and formula coefficients was determined:
Concept relevance High impact
Concept’s word of mouth Mid impact
Concept’s value for money Mid impact
Concept uniqueness Low impact
Raw score calculations
On individual level
— Sum of weighted:
  — Concept relevance
  — Word of mouth
  — Value for money
  — Uniqueness
— Weighting coefficients reflect score modeling described above
On aggregate level
— Weighted average of individual scores
— Weights reflect fit to age/gender proportions of target population
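The two levels of the raw score calculation can be sketched as below. The source ranks the variables only by impact (high, mid, low) and does not publish the formula coefficients, so the values in `COEFFS` are hypothetical placeholders chosen to respect that ranking. The function names and the answer layout are likewise assumptions made for illustration.

```python
# Hypothetical coefficients: the real values are not given in the source,
# which only states high / mid / mid / low impact.
COEFFS = {
    "relevance": 0.40,        # high impact
    "word_of_mouth": 0.25,    # mid impact
    "value_for_money": 0.25,  # mid impact
    "uniqueness": 0.10,       # low impact
}


def individual_score(answers, coeffs=COEFFS):
    """Individual level: weighted sum of one respondent's four metrics."""
    return sum(coeffs[name] * answers[name] for name in coeffs)


def aggregate_score(respondents, weights, coeffs=COEFFS):
    """Aggregate level: weighted average of the individual scores.

    `weights` are the respondent weights that fit the sample to the
    age/gender proportions of the target population.
    """
    total = sum(weights)
    return sum(w * individual_score(r, coeffs)
               for r, w in zip(respondents, weights)) / total


# Two hypothetical respondents answering on a 1..5 scale.
answers = [
    {"relevance": 5, "word_of_mouth": 5, "value_for_money": 5, "uniqueness": 5},
    {"relevance": 1, "word_of_mouth": 1, "value_for_money": 1, "uniqueness": 1},
]
print(aggregate_score(answers, weights=[3.0, 1.0]))
```

With equal respondent weights the aggregate reduces to a plain mean of the individual scores.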
Normalization
Framework
— We believe concept test results are useful only in context, i.e. against benchmarks
— Thus, we normalize the raw score and each raw metric to a 0..100 scale that represents its performance against benchmarks
— A small extra benefit: 0..100 scores are easier to read and compare
Benchmarks
— For each idea we store:
  — Description and sample
  — All calculated raw metrics and scores, i.e. raw benchmarks
  — All normalized metrics and scores, i.e. scaled benchmarks
— The list is updated with each new computation
Benchmarks for idea score
The distribution of raw idea scores is close to normal and can therefore be used for sensible 0..100 scaling
Normalization
— First, we calculate the average (avg) and standard deviation (sdev) of the distribution of all raw benchmarks
— Then we calculate the normalized value by measuring how far the raw score or metric (rvalue) deviates from the average (avg), in units of standard deviation (sdev)
— The normalized score shows how a given concept (or its metric) compares with the whole distribution of other concepts in our database
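The steps above can be sketched as follows. The deviation in standard deviations, (rvalue - avg) / sdev, follows directly from the description. How that deviation is mapped onto 0..100 is not stated in the source, so this sketch assumes a normal-distribution percentile (the normal CDF multiplied by 100), which is consistent with the near-normal distribution of raw scores but may differ from the mapping proved.co actually uses. The function name and benchmark values are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def normalize(rvalue, raw_benchmarks):
    """Rescale a raw score or metric to 0..100 against raw benchmarks.

    Assumption: the 0..100 value is the normal-CDF percentile of the
    deviation from the benchmark average, measured in standard deviations.
    """
    avg = mean(raw_benchmarks)
    sdev = pstdev(raw_benchmarks)
    if sdev == 0:
        raise ValueError("benchmarks have zero spread; cannot normalize")
    z = (rvalue - avg) / sdev  # deviation in standard deviations
    return 100 * NormalDist().cdf(z)


# Hypothetical raw benchmark scores from previously tested concepts.
benchmarks = [2.0, 3.0, 4.0]
print(normalize(3.0, benchmarks))  # a raw score equal to the average
print(normalize(3.8, benchmarks))  # a raw score above the average
```

Under this assumption a concept that scores exactly at the benchmark average lands at 50, and the result always stays between 0 and 100.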