Case Study Examples to Demonstrate the use of Samples of Anonymised Records in Marketing Analysis

BARRY LEVENTHAL BERRY CONSULTING

Occasional Paper 5

UNIVERSITY OF MANCHESTER CENSUS MICRODATA UNIT

1994

Barry Leventhal Berry Consulting

2 Charterhouse Mews London, EC1M 6BB

© CENSUS MICRODATA UNIT 1994 ISBN 1 899005 06 4

The census data in this paper is Crown Copyright, supplied courtesy of OPCS and GRO(S)

Printed in Great Britain by:

Census Microdata Unit Faculty of Economic & Social Studies The University of Manchester Oxford Road, Manchester M13 9PL.

Contents

1. Introduction

2. Potential Applications of Census Microdata in Marketing

3. Case Study 1 - Quantifying a Target Market

4. Case Study 2 - Demographic Modelling

5. Main Conclusions

Appendices

A: Presentation of Case Study Results

B: Note on tree segmentation technique

1. Introduction

In the autumn of 1993 the Census Microdata Unit at Manchester University received two Samples of Anonymised Records (SARs) from the 1991 Census. This was the first ever release of individual-level data from a population census in Great Britain, and was the outcome of lengthy discussions between the ESRC and the Census Offices.

The Census Microdata Unit (CMU) is responsible for disseminating information from the SARs to the community of census users, including commercial organisations under a special licence agreement.

In order to demonstrate some of the potential applications in marketing, the CMU commissioned Berry Consulting (BCo) to produce two case study analyses using SAR data. This report describes the two case studies and presents the main results.

Both case studies were produced on a PC using standard statistical software.

2. Potential Applications of Census Microdata in Marketing

Marketing analysts and planners of all types need to know about the demographics of the population. Census microdata give us the most flexible possible source of demographic information and so have numerous potential applications.

First we shall consider the range of possibilities, before demonstrating these with two case studies.

Electronic Census Reports

Census microdata can be employed to provide an 'electronic alternative' to the printed census reports, and equally to obtain new reports that have never actually been published.

Conventional census results are provided in a long series of volumes which include topic reports, mainly at national or regional level, county monitors and county reports.

Despite this wealth of output, the potential analyst is liable to find these reports unwieldy to use. They may be inconvenient to access, may not provide exactly the required demographic relationships or level of detail, and results may need to be amalgamated across geographical areas.

The 'Electronic Census Report' (ECR) can be created simply by loading SAR data into a user-friendly software package - such as one of the many survey analysis systems or the 'USAR' program developed specifically for handling the SARs.

Armed with a desk-top ECR, the analyst could define and produce demographic tables either nationally or for required areas (as aggregations of districts or regions). The reports could then be passed on to other software for charting and presentation.

Market Planning

If the target market for a product can be defined demographically, census microdata may be employed to obtain the market size and understand its characteristics and regional dispersion.

Although conventional census output may be used in a similar way, it may not fit the required target market definition exactly, meaning that approximations and assumptions then have to be made. Since standard census reports are mainly two-way cross-tabulations, they will probably contain insufficient detail to look at the demographic characteristics of the target group.

This major application is demonstrated below, in Case Study 1.

Retail Site Planning

Knowledge of the population within a district is essential for planning locations of retail developments. Census microdata permits in-depth analysis of population structure and dynamics, and so could form a benchmark for site planning.

We must remember that the SARs do not identify small geographical areas, and so the Census Local Base Statistics or Small Area Statistics will probably still be required for catchment area analysis.

Market Share Evaluation

Organisations which hold information about their customers may use census microdata in order to benchmark their market against the population, and so obtain market share within each demographic group.
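As an illustration of this kind of benchmarking, the following is a minimal Python sketch rather than a procedure taken from the case studies. The group labels, customer counts and SAR counts are invented, and the grossing-up factor (the multiplier that scales sample counts to population estimates) is an assumption that depends on the sampling fraction of the SAR file being used.

```python
def market_share_by_group(customers, sar_counts, grossing_factor):
    """Return {group: customers as a percentage of the estimated population}.

    customers       -- {group: number of customers held on file} (hypothetical)
    sar_counts      -- {group: number of SAR records in that group}
    grossing_factor -- multiplier scaling SAR counts to population estimates
                       (assumed; depends on the file's sampling fraction)
    """
    shares = {}
    for group, sar_n in sar_counts.items():
        population = sar_n * grossing_factor
        if population == 0:
            continue  # no population estimate, so no share can be computed
        shares[group] = 100.0 * customers.get(group, 0) / population
    return shares


# Illustrative figures only, not taken from the SARs.
sar_counts = {"16-34": 4000, "35-54": 3000, "55+": 3000}
customers = {"16-34": 10000, "35-54": 15000, "55+": 3000}

shares = market_share_by_group(customers, sar_counts, grossing_factor=50)
for group, share in shares.items():
    print(f"{group}: {share:.1f}%")
```

The customer groups must be defined in exactly the same way as the groups counted in the SAR, otherwise the shares are not comparable.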

Customer Segment Evaluation

Many organisations are adopting customer segmentation as the basis for marketing to their customers. A set of customer segments could be 'mapped onto' the SARs, so as to understand the overall size and distribution of these segments in the population. From this, market shares within segments could then be obtained.

For example, Lifestage is often a key segmentor in financial services markets. The SAR user can employ the Lifestage groups used in the Census, or define a new set of segments to match those being applied to customers.

Demographic Modelling

Because the SARs consist of individual records for persons and households, they may be used as a testbed to develop models and segmentation systems. For example, a classification of households could be constructed, which should be a more accurate indicator of household behaviour than the area discriminators such as ACORN and MOSAIC. Area discriminators all suffer from the 'ecological fallacy' which means that they may not accurately describe the individuals living in each area. In principle, the household classification could be applied to the main census database held by OPCS/GRO(S) in order to obtain the breakdown of household types at small area level.

Similarly, models could be built to predict 'hard to collect' variables from other more readily available attributes. Case Study 2 demonstrates this type of analysis.

3. Case Study 1 - Quantifying a Target Market

Introduction

When marketing any product an understanding of the size and characteristics of the target market is obviously vital.

Often, the target market is defined in demographic terms and so census information should be the most exact data source for measuring market size and penetration. However, published census results, such as the topic reports and local statistics, may not give the required count. For example, they generally provide population counts cross-analysed by two demographics at a time, and age is often banded into 5 or 10 year ranges. Therefore, in employing such sources, approximation and guesswork are often required.

Using census microdata, target markets can be defined exactly as the required combinations of census demographics.

This is demonstrated in Case Study 1 for a hypothetical target market:

"Heads of household aged 55+ who are income earners and own their homes outright."

This target market would be of obvious importance to companies marketing certain products and services, for example luxury goods and holidays.

In the Census, the Head of Household is taken as the first person on the schedule who is 16 or over and not a visitor. This procedure differs from the approach adopted in Market Research and we have not attempted to adjust for it in this case study. However, census microdata could be employed to identify an alternative Head in each household, and quantify the potential error in the census definition.

In Case Study 1, the target market is quantified and profiled by other attributes in order to help understand its characteristics.

Method

Using the Individuals SAR file, all records with age less than 16 were excluded.

The target market was defined as:

Heads of household

Aged 55+

Earners (employed or self-employed)

Owning home outright

The variables used from the SAR dataset were:

Relat     (Relationship to household head)      =   1 (household head)
Age       (Age)                                 ge  55
Ecposfhp  (Economic position of family head)    =   1 (in employment)
Tenure    (Tenure of household space)           =   1 (owner occ - outright)

The target records were flagged and cross-tabulated against the rest of the sample by a number of variables.

The cross-tabs gave the penetration of the target market within other attributes; these values were then indexed against the national average.
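The flagging and indexing steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the software used for the case study: the lowercase field names, the "region" profiling variable, the sample records and every code value other than those listed in the Method are hypothetical.

```python
from collections import Counter


def is_target(rec):
    """Target definition from the Method: head of household, aged 55+,
    in employment, owning the home outright."""
    return (rec["relat"] == 1 and rec["age"] >= 55
            and rec["ecposfhp"] == 1 and rec["tenure"] == 1)


def penetration_index(records, by):
    """Return {category: (penetration %, index)} for one profiling variable.

    Index = 100 * penetration in the category / overall penetration,
    so 100 means the category matches the overall average.
    Assumes at least one target record is present among adults.
    """
    adults = [r for r in records if r["age"] >= 16]  # base: persons aged 16+
    base = Counter(r[by] for r in adults)
    target = Counter(r[by] for r in adults if is_target(r))
    overall = sum(target.values()) / len(adults)
    result = {}
    for category, n in base.items():
        pen = target[category] / n
        result[category] = (100 * pen, 100 * pen / overall)
    return result


def rec(region, relat, age, ecposfhp, tenure):
    # Helper for building hypothetical records.
    return {"region": region, "relat": relat, "age": age,
            "ecposfhp": ecposfhp, "tenure": tenure}


records = [
    rec("A", 1, 60, 1, 1),  # target
    rec("A", 1, 70, 1, 1),  # target
    rec("A", 2, 58, 1, 1),  # not head of household
    rec("A", 1, 40, 1, 1),  # under 55
    rec("B", 1, 65, 2, 1),  # not in employment (code 2 is hypothetical)
    rec("B", 1, 65, 1, 2),  # not an outright owner (code 2 is hypothetical)
    rec("B", 2, 30, 1, 3),
    rec("B", 2, 25, 1, 3),
    rec("B", 3, 10, 0, 3),  # under 16: excluded from the base
]

result = penetration_index(records, "region")
for category, (pen, index) in sorted(result.items()):
    print(f"{category}: penetration {pen:.1f}%, index {index:.0f}")
```

An index above 100 marks a category where the target market is over-represented relative to the overall average, and an index below 100 marks under-representation.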

Results

A total of 13,088 individual records were identified as belonging to the target market, out of 894,115 persons aged 16+ present in the Individuals SAR. Therefore the penetration of the target market was 1.46%, amongst adults aged 16+.

The identified records were then profiled by regions and key demographics, indicating that the target market is:

o most concentrated in East Midlands and East Anglia, and lowest in Inner London

o mainly Married or Remarried

o weak among non-white ethnic groups

o skewed towards professional self employed occupations and farmers

o likely to possess amenities such as central heating and cars.

4. Case Study 2 - Demographic Modelling

Introduction

The purpose of the second case study is to demonstrate the ability to use census microdata for individual-level analysis, such as demographic modelling. This type of analysis cannot be undertaken with conventional census output which aggregates together individual results.

The objective of Case Study 2 is to model the propensity for households to belong to a particular target group in terms of other attributes or "predictors". Having developed such a model, it could then be employed to impute propensities on a separate dataset, for example a customer file or a research survey.

The nominated target for this exercise was membership of Social Class I, a group of obvious marketing importance. The predictor attributes were a set of simpler census variables coded at "100% level" in the 1991 Census.

A simple tree segmentation model was developed using the CHAID package for Chi-square automatic interaction detection. This method was chosen because it presents the analysis results in a clear visual form. However, a number of other techniques could have been employed instead, including logistic regression and log-linear analysis.
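To give a feel for the statistic underlying this technique, the sketch below computes a Pearson chi-square statistic for a cross-tabulation of one predictor against target membership. It is a simplified Python illustration, not the CHAID package used in the study: CHAID additionally merges similar categories and adjusts significance levels for multiple comparisons, which is omitted here. The table of counts is invented.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a
    contingency table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected == 0:
                continue  # empty row or column contributes nothing
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof


# Hypothetical counts: rows are predictor categories (for example 0, 1 and
# 2+ cars); columns are target and non-target households.
cars_table = [[10, 90], [30, 70], [60, 40]]
stat, dof = chi_square(cars_table)
print(f"chi-square = {stat:.1f} on {dof} degrees of freedom")
```

A large statistic relative to its degrees of freedom indicates that the predictor is strongly associated with the target; a tree method of this kind splits first on the predictor showing the most significant association and then repeats the search within each resulting sub-group.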

Method

The Household SAR file contains data both at household level and at person level.

For the household level file the following variables were included:

cars      (Number of cars)
tenure    (Tenure type)
roomsnum  (Number of rooms)
cenheat   (Availability of central heating)
hhsptype  (Household space type)

For the person level file, all records where relat (Relationship to household head) = 1 (household head) were included and the following variables were retained:

age      (Age)
soclass  (Social class based on occupation)

These two files were then merged and each record was flagged as either being Social Class I or not (i.e. head of household in Social Class I).

A similar result could be achieved using the individual-level SAR file, by selecting RELAT = 0 to obtain heads of household with housing information attached.

A 1 in 20 sample was drawn for those not in the target group and combined with all households in the target group, giving a sample of 20,626 records.

A tree model using CHAID was then developed on one half of the sample, validated on the other half and then re-run on the whole sample.
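The sampling and split-half steps might be sketched in Python as below. The paper does not say how the 1 in 20 sample or the two halves were drawn, so simple random selection with a fixed seed is an assumption here, as are the "class_i" field name and the invented household counts.

```python
import random


def build_analysis_sample(households, rate=20, seed=0):
    """Keep every target household, draw a 1-in-`rate` random sample of
    the rest, and split the combined sample into two halves
    (development and validation)."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    target = [h for h in households if h["class_i"]]
    others = [h for h in households if not h["class_i"]]
    sampled = rng.sample(others, len(others) // rate)
    sample = target + sampled
    rng.shuffle(sample)  # mix target and non-target before halving
    half = len(sample) // 2
    return sample[:half], sample[half:]


# Hypothetical data: 100 target and 2,000 non-target households.
households = ([{"id": i, "class_i": True} for i in range(100)]
              + [{"id": 100 + i, "class_i": False} for i in range(2000)])

development, validation = build_analysis_sample(households)
print(len(development), len(validation))
```

Sampling down the non-target group keeps the analysis file small while retaining every target record; the price is that penetrations measured on the sample are inflated and need to be weighted back before being quoted.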

Results

The target group comprised 10,357 households in Social Class I. Taking a 1 in 20 sample of other households gave 10,269 non-target households for comparison. Therefore the target households accounted for 50.2% of the analysis sample, but their true penetration across all households was 4.8%.

The tree segmentation analysis identified that the most important predictors of Social Class I (amongst 100% coded variables) are Number of cars, Tenure, Household space type and Age of household head.

Two versions of the tree segmentation are presented in Appendix A:

a) the unweighted analysis, as obtained using CHAID

b) the corresponding results after weighting non-target households by a factor of 20, in order to produce realistic estimates for the penetration of Social Class I among all households

From the weighted results, the tree analysis identified a series of sub-groups with Social Class I penetration ranging from 15.1% down to 0.1%.
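The effect of the weighting can be checked with a short Python sketch. The function name is illustrative; the counts are the overall sample figures reported in the Results for Case Study 2, and the same calculation would apply to the counts within any sub-group of the tree.

```python
def weighted_penetration(target_n, non_target_n, weight=20):
    """Penetration (%) of the target group after multiplying the
    non-target count by `weight` to undo the 1-in-20 sampling."""
    return 100.0 * target_n / (target_n + weight * non_target_n)


# Overall analysis sample: 10,357 target and 10,269 non-target households.
unweighted = weighted_penetration(10357, 10269, weight=1)
weighted = weighted_penetration(10357, 10269, weight=20)
print(f"{unweighted:.1f}% unweighted, {weighted:.1f}% weighted")
# prints "50.2% unweighted, 4.8% weighted"
```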

5. Main Conclusions

By producing these case studies we may conclude that:

a) The SARs are a rich and flexible source of data for analysing target markets or customer segments.

b) SAR data enable the 1991 Census to be analysed in ways that could not be achieved from the conventional output products. Techniques similar to those commonly adopted elsewhere in market research and marketing may be employed.

c) Users will be able to manipulate SAR data with their own computers and software; census analysis should therefore be easier and more affordable than ever before.

APPENDICES

Appendix A : Presentation of Case Study Results

Case Study 1

Target Market

o Head of household

o Aged 55+

o Earners (employed or self employed)

o Owning home outright

Base

o Individuals SAR aged 16+

Sample Sizes

Target Market: 13,088

Base: 894,115

GB Average Penetration: 1.46%

[Chart: Penetration of Target Market - SAR Regions. Axis: Index (100 = GB Average). Categories: North; Yorks and Humbs; East Midlands; East Anglia; Inner London; Outer London; Rest of S. East; South West; West Midlands; North West; Wales; Scotland.]

[Chart: Penetration of Target Market - Marital Status. Axis: Index (100 = GB Average). Categories: Single; Married; Remarried; Divorced; Widowed.]

[Chart: Penetration of Target Market - Ethnic Group. Axis: Index (100 = GB Average). Categories: White; Non-White.]

[Figure: Penetration of Target Market by Social Class (Based on Occupation). Bar chart of index values (100 = GB average; axis 0 to 200) for Professional, Manag Tech, N Skilled, M Skilled, Part Skilled, Unskilled, Armed Forces, Inad Described and Not Stated.]


[Figure: Penetration of Target Market by Socio-Economic Group. Bar chart of index values (100 = GB average; axis 0 to 700) for Emp Mang Large, Employer Sml, Managers Sml, Prof Self Empl, Prof Employees, Ancil Artist, Formn/Sup NM, Junior NM, Personal Servc, Foremn/Wker Man, Skilled Manual, Semi-Skil Man, Unskilled Man, Own Account, Farm-Emp Mang, Farmer Own Ac, Agricultural, Armed Forces and Inad Described.]


[Figure: Penetration of Target Market by Availability of Central Heating. Bar chart of index values (100 = GB average; axis 0 to 120) for In All Rooms, In Some Rooms and No Central Heating.]


[Figure: Penetration of Target Market by Number of Cars. Bar chart of index values (100 = GB average; axis 0 to 160) for 0, 1, 2 and 3+ cars.]


Case Study 2

Target Group

o Heads of households in Social Class I

Base

o Household SAR


Analysis Technique

o Tree Segmentation (CHAID)

Candidate Predictors

o Number of cars

o Tenure type

o Number of rooms

o Availability of central heating

o Household space type

o Age of household head


Sample Sizes

Target Group: 10,357

1 in 20 sample of other households: 10,269

Penetration of Target Group within sample: 50.21%


TREE SEGMENTATION ANALYSIS (Unweighted)

Target Group: Households in Social Class I

Base: Household SAR

Analysis based on all Social Class I households and a 1 in 20 sample of non-Social Class I households

Key: each node shows sample size, then % of Class I households in brackets

All households: 20,626 (50.2%)

Number of cars: 2 or more: 7,270 (68.2%)
    Household space type: Detached, other: 3,978 (75.8%)
        Tenure: two nodes, 3,023 (78.1%) and 955 (68.5%), split between "Owner occ. buying" and "Owner occ. outright, private rented, other"
    Household space type: Semi-detached, terraced, residential flat: 3,292 (59.1%)
        Age: 16 - 34: 1,086 (66.8%)
        Age: 35 - 44: 897 (59.2%)
        Age: 45+: 1,309 (52.6%)

Number of cars: 1: 9,186 (52.0%)
    Tenure: Owner occ. buying, private rented: 6,156 (59.2%)
        Household space type: Detached, residential flat, other: 2,467 (67.0%)
        Household space type: Semi, terraced: 3,689 (54.0%)
            Age: 25 - 44: 2,461 (58.8%)
            Age: 16 - 24, 45+: 1,228 (44.4%)
    Tenure: Owner occ. outright: 2,190 (45.6%)
        Household space type: Detached, residential flat: 969 (54.6%)
        Household space type: Semi, terraced, other: 1,221 (38.4%)
    Tenure: Other rented: 840 (15.5%)

Number of cars: 0: 4,170 (14.9%)
    Tenure: Owner occ. buying, private rented: 1,325 (34.0%)
    Tenure: Owner occ. outright: 906 (8.3%)
    Tenure: Other: 1,939 (4.9%)
        Age: 16 - 64: 1,027 (7.8%)
        Age: 65+: 912 (1.6%)


TREE SEGMENTATION ANALYSIS (Weighted)

Target Group: Households in Social Class I

Base: Household SAR

Analysis grossed up to represent all SAR households

Key: each node shows sample size, then % of Class I households in brackets

All households: 215,800 (4.8%)

Number of cars: 2 or more: 51,200 (9.7%)
    Household space type: Detached, other: 22,300 (13.5%)
        Tenure: two nodes, 15,600 (15.1%) and 6,700 (9.8%), split between "Owner occ. buying" and "Owner occ. outright, private rented, other"
    Household space type: Semi-detached, terraced, residential flat: 28,900 (6.7%)
        Age: 16 - 34: 7,900 (9.1%)
        Age: 35 - 44: 7,900 (6.8%)
        Age: 45+: 13,100 (5.3%)

Number of cars: 1: 93,000 (5.1%)
    Tenure: Owner occ. buying, private rented: 53,900 (6.8%)
        Household space type: Detached, residential flat, other: 17,900 (9.2%)
        Household space type: Semi-detached, terraced: 35,900 (5.5%)
            Age: 25 - 44: 21,700 (6.7%)
            Age: 16 - 24, 45+: 14,200 (3.8%)
    Tenure: Owner occ. outright: 24,800 (4.0%)
        Household space type: Detached, residential flat: 9,300 (5.7%)
        Household space type: Semi-detached, terraced, other: 15,500 (3.0%)
    Tenure: Other rented: 14,300 (0.9%)

Number of cars: 0: 71,600 (0.9%)
    Tenure: Owner occ. buying, private rented: 17,900 (2.5%)
    Tenure: Owner occ. outright: 16,700 (0.4%)
    Tenure: Other: 37,000 (0.3%)
        Age: 16 - 64: 19,000 (0.4%)
        Age: 65+: 18,000 (0.1%)
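The weighted figures can be related back to the unweighted analysis by grossing up only the sampled households. A minimal sketch, assuming each non-Social Class I household simply carries a weight of 20 (the inverse of the 1 in 20 sampling fraction) and each Social Class I household a weight of 1; the function name is hypothetical, and the paper does not spell out its exact weighting procedure:

```python
def gross_up(target_n: int, other_n: int, other_weight: int = 20):
    """Return (weighted size, weighted % target) for one node.

    target_n:     Social Class I households in the node (kept in full).
    other_n:      sampled non-Social Class I households in the node.
    other_weight: assumed weight that undoes the 1 in 20 sampling.
    """
    weighted_size = target_n + other_weight * other_n
    weighted_pct = 100.0 * target_n / weighted_size
    return weighted_size, weighted_pct


# Root node, using the sample sizes reported for Case Study 2.
target_n, other_n = 10_357, 10_269

unweighted_pct = 100.0 * target_n / (target_n + other_n)
size, pct = gross_up(target_n, other_n)

print(f"Unweighted: {target_n + other_n:,} households, {unweighted_pct:.2f}% Class I")
print(f"Weighted:   {size:,} households, {pct:.1f}% Class I")
```

Under this assumption the root node comes out at 215,737 households with 4.8% in Social Class I, in line with the 215,800 and 4.8 shown at the top of the weighted tree.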


Appendix B : Tree Segmentation

Two main packages are available for producing tree segmentation analysis.

Automatic Interaction Detector (AID)

AID is a powerful, statistically appropriate tool for identifying population groups (segments) that differ in their probability of an outcome, such as response to a mailing or purchase of a product.

AID is an explanatory tool. It gives clear insights into the underlying factors associated with differences in the value of the outcome variable.

AID identifies interactions between the predictor variables, i.e. situations where the outcome depends on a particular combination of predictors.

AID works by checking each predictor variable in turn and selecting both the predictor and cut-off point which segment the file most effectively for the dependent variable. This splits the file into two subgroups. Each of these subgroups is then tested, in turn, in order to find the predictor which best segments it further, giving four subgroups. The process continues until there are no further groups that can be split meaningfully.
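One step of this search can be sketched as follows. The sketch assumes that "most effectively" means the largest between-group sum of squares on the dependent variable, which is a common criterion for AID-style splitting but is not stated in this appendix; the function names and the toy data are hypothetical.

```python
def best_binary_split(x, y):
    """Best cut-off on one predictor: records with x <= cutoff versus the rest.

    Returns (cutoff, between_group_sum_of_squares); cutoff is None when no
    split separates the group means at all.
    """
    pairs = sorted(zip(x, y))
    n = len(pairs)
    total = sum(value for _, value in pairs)
    grand_mean = total / n
    best_cutoff, best_bss = None, 0.0
    left_sum = 0.0
    for i in range(1, n):
        left_sum += pairs[i - 1][1]
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between two equal predictor values
        left_mean = left_sum / i
        right_mean = (total - left_sum) / (n - i)
        bss = (i * (left_mean - grand_mean) ** 2
               + (n - i) * (right_mean - grand_mean) ** 2)
        if bss > best_bss:
            best_cutoff, best_bss = pairs[i - 1][0], bss
    return best_cutoff, best_bss


def best_predictor(predictors, y):
    """Check each predictor in turn and keep the one with the strongest split."""
    results = {name: best_binary_split(x, y) for name, x in predictors.items()}
    name = max(results, key=lambda key: results[key][1])
    return name, results[name][0]


# Toy file of six records with a 0/1 outcome (invented for illustration).
outcome = [0, 0, 0, 1, 1, 1]
predictors = {"rooms": [1, 2, 3, 4, 5, 6], "noise": [3, 1, 2, 3, 1, 2]}
print(best_predictor(predictors, outcome))
```

Applying the same function to each of the two resulting subgroups, and then to their subgroups, grows the tree; a practical implementation would also need a stopping rule such as a minimum subgroup size.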

AID provides a visual display of results in tree diagram form, where each node represents a subgroup of the file and is accompanied by the sample size and average score on the dependent variable.

Chi-Squared Automatic Interaction Detection (CHAID)

The techniques described above were designed for situations where the dependent and predictor variables are quantitative. They may be extended to handle 0/1 variables, such as response or ownership, but are less suitable for categorical data.

CHAID is a variation upon the AID approach, which requires all variables to be categorical. CHAID also segments the sample in a stepwise manner, producing a tree analysis as the result. However, CHAID selects subgroups using a different criterion (a chi-squared measure as described above) and can generate multi-way splits, rather than only two-way.
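The chi-squared criterion can be illustrated with a small calculation. This is a sketch only: it computes the Pearson chi-squared statistic for the cross-tabulation of each candidate predictor against a 0/1 target, whereas a full CHAID implementation also merges similar categories and compares adjusted significance levels rather than raw statistics. The predictor names and counts are hypothetical.

```python
def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a
    contingency table given as rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical cross-tabulations: one row per predictor category,
# columns = [in target group, not in target group].
candidates = {
    "predictor_a": [[80, 20], [50, 50], [10, 90]],  # three-way split
    "predictor_b": [[48, 52], [92, 108]],           # nearly independent
}

for name, table in candidates.items():
    statistic, dof = chi_squared(table)
    print(f"{name}: chi-squared = {statistic:.1f} on {dof} d.f.")
```

Because the table may have any number of rows, the same measure supports the multi-way splits mentioned above; a predictor whose categories differ sharply in target penetration yields a much larger statistic than one that is close to independent of the target.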


Other titles available from the Census Microdata Unit:

Census Microdata Unit Occasional Papers

No. 1 Problems of Imputation in the 1991 Census, (1 899005 02 1), Amarjit Sandhu

No. 2 Bias, Sampling Error and Coverage: the preliminary validation of the Samples of Anonymised Records from the 1991 Census, (1 899005 03 X), Steve Simpson, Ed Fieldhouse & Amarjit Sandhu

No. 3 An Introductory Guide to Analysing the SARs, (1 899005 04 8), Liz Middleton

No. 4 Resource Allocation using measures of relative social needs in geographical areas: the relevance of the signed chi-squared, the percentage, and the raw count, (1 899005 05 6), Steve Simpson

SARs User Guide (1 899005 01 3)

Manchester Census Group

The Ethnic Dimensions of the 1991 Census: A Preliminary Report, (1 899005 00 5), Roger Ballard & Virander Singh Kalra


£3.00

ISBN 1 899005 06 4

Further copies of this paper may be obtained from:

Census Microdata Unit, Faculty of Economic & Social Studies, The University of Manchester, Oxford Road, Manchester M13 9PL
