
Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Page 1: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Beyond 2011 The future for population statistics?
IMA Mathematics 2012

Pete Benton
Beyond 2011 Programme Director
Office for National Statistics

Page 2: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Outline

• Background to the Census
• The Beyond 2011 Programme
• Statistical options for the future
• Key mathematical challenges
• Timeframes
• Next steps

Page 3: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

The purpose of the census

• The basis for national decision making:

Service planning
• where to locate schools, hospitals, etc.
• housing plans
• transport

Resource allocation
• health and local govt
• £100bn each per year

Policy making and monitoring
• Equality – age, sex, ethnicity, disability
• Ageing population – pensions etc

Academic and social research

Page 4: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Key Census outputs

• Benchmark statistics on:

Population units:
• people and housing
• with key demographics (age, sex, ethnicity)

Population structures:
• households, families

Population and housing attributes

• For small areas and small population groups
• With multivariate analysis
• Consistent and comparable

Page 5: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

The 2011 Census

• Very successful
- 94% response overall
- Over 90% across London overall
- Over 80% response in every Local Authority
• Significant improvement in key Local Authorities
• The result of extensive mathematical modelling
- Response targets to achieve required output quality
- Predicted initial response from key groups / areas
- Numbers of field staff required to reach final targets
- Daily live response rate modelling to support operational decisions

Page 6: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

The Beyond 2011 Programme

• Why change? – Why look beyond 2011?
- Rapidly changing society
- Evolving user requirements
- New opportunities – data sharing
- Traditional census – costly and infrequent??

• UK Statistics Authority to Minister for Cabinet Office

“As a Board we have been concerned about the increasing costs and difficulties of traditional Census-taking. We have therefore already instructed the ONS to work urgently on the alternatives, with the intention that the 2011 Census will be the last of its kind.”

Page 7: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Beyond 2011: Statistical options

Census options
• Traditional Census (long form to everyone)
• Rolling Census (over 5/10 year period)
• Short Form (everyone), Long form (Sample)
• Short Form + Annual Survey (US model)

Survey option(s)
• Address register + Survey

Administrative data options
• Aggregate analysis
• 100% linkage to create ‘statistical population spine’
• (Intermediate) Sample linkage e.g. 1% of postcodes

Page 8: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Beyond 2011 – statistical options

[Diagram: flow from SOURCES through FRAME, DATA and ESTIMATION to OUTPUTS, all from national to small area level]

Sources: CENSUS; admin sources; commercial sources?; socio-demographic survey(s); surveys to fill gaps
Frame: address register (household, communal). Maintained national address gazetteer – provides frame for population data & surveys
Data: population data; socio-demographic attribute data (increasing later?)
Estimation: coverage assessment, incl. under & over-coverage (by survey and admin data?); adjusting for missing data and error; adjusting for non response and bias in survey (or sources); quality measurement; population distribution provides weighting for attributes
Outputs: population estimates; attribute estimates; household structure etc; longitudinal data; interactional analysis, e.g. TTWA

Page 9: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Potential data sources

• Population data
- NHS Patient Register
- DWP/HMRC Customer Information System
- Electoral roll (> 17 yrs)
- School Census (5-16 yrs)
- Higher Education Statistics Agency data (Students)
- Birth and Death registrations

• Socio-demographic sources
- Surveys
- DVLA?
- Commercial sources?
- Utilities?
- TV licensing?

Page 10: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

DWP CIS population counts compared with ONS Mid Year population estimates

Page 11: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Patient Register population counts compared with ONS Mid Year population estimates

Page 12: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Electoral Roll population counts compared with ONS Mid Year population estimates

Page 13: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Coverage Of Main Administrative Sources

[Diagram: overlap between the Resident Population and each administrative source]

Sources shown: Customer Information System (CIS), Patient Register Data (PRD), Electoral Roll (ER), School Census (SC), Higher Education Students (HESA), UK Driving Licence (DVLA)

Annotations on the diagram:

Extras includes: Some duplicates; International students on short-term courses; Students ceased studying, not formally deregistered

Extras includes: Short-term migrant children

Missing includes: Under 17s; Ineligible voters; Non responders

Missing includes: Non school aged people; Independent school children; Home schooled children

Missing includes: Some migrant worker dependants; Some international students; Undocumented asylum seekers

Missing includes: Migrants not (yet) registered; Newborn babies; Some private only patients

Missing includes: Non higher education students; Independent University students

Extras includes: Some duplicates; Some expats; Some deceased; Short-term migrants

Extras includes: Multiple registrations; Some expats; Some deceased; Short-term migrants

Extras includes: Some expats; Some deceased; Short-term migrants

Missing includes: Non-drivers; Under 17s; Some foreign-licence holders

Extras includes: Some expats; Some deceased

Page 14: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Key risks of non census alternatives

• Public opinion
• Technical challenge
• Changes in administrative datasets
• UK harmonisation
• Getting a decision

Page 15: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Key mathematical challenges

• Methods for Production of statistics
- Coverage assessment and adjustment
- Data matching
- Correcting for missing data
- Small area population attribute modelling

• Methods for Protection of confidentiality
- Data pre-processing and encryption
- Statistical Disclosure Control

• Evaluation
- Quantifying financial benefits
- Defining what is an ‘acceptable’ level of quality

Page 16: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Coverage assessment

• How many fish in your pond?
- Day 1, catch 100, tag them, put them back
- Day 2, catch 50, find 25 already tagged
- How many fish in your pond?

• Answer: 200 (ish)
- According to day 2, half in the pond are marked
- We marked 100, so there must be about 200 altogether

• “Dual System Estimation”
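The fish-pond arithmetic can be written as a short sketch. The function name `dual_system_estimate` is illustrative and not taken from the slides; the calculation is the standard capture-recapture ratio described in the bullets.

```python
def dual_system_estimate(first_count, second_count, matched):
    """Estimate a total from two independent 'catches'.

    first_count  -- number caught (and tagged) on day 1
    second_count -- number caught on day 2
    matched      -- number in the day-2 catch that were already tagged
    """
    if matched <= 0:
        raise ValueError("at least one matched individual is needed")
    # Day 2 suggests a fraction matched / second_count of the pond is tagged,
    # so the total is first_count divided by that fraction.
    return first_count * second_count / matched


# The pond example from the slide: tag 100, catch 50, find 25 tagged.
pond_estimate = dual_system_estimate(100, 50, 25)
print(pond_estimate)  # 200.0
```

The estimate relies on the two catches being independent and on tags not being lost, which is why the slide says "200 (ish)".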

Page 17: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Application to the census

• We ‘fish’ twice, in 1% of postcodes
- Census
- Then census coverage survey (CCS) 6 weeks later

• No need for tags
- They have names, addresses, dates of birth
- We match the two separate lists of people (500k) to work out:
- What percentage of people in the CCS had first been ‘caught’ in the census
- Thus, the total population in each postcode

Page 18: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Coverage adjustment

• Apply the adjustment factor to the other 99% of postcodes where we did no CCS
- With appropriate stratification

• Add ‘synthetic’ records
- Extra households
- Extra people
- With the right key characteristics
- In roughly the right locations
- Using ‘Donor imputation’ to complete each record
- So that all the final tables add up to the right number
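A minimal sketch of the first bullet, applying stratum-level adjustment factors to areas without a coverage survey. The stratum names and all counts are invented for illustration, and the real adjustment is considerably more elaborate than a single ratio per stratum.

```python
def stratum_factor(census_count, ccs_count, matched):
    """Adjustment factor for one stratum in the sampled postcodes.

    The dual system estimate is census_count * ccs_count / matched;
    dividing by the census count gives the factor to scale census counts by.
    """
    dual_system_total = census_count * ccs_count / matched
    return dual_system_total / census_count


# Hypothetical sampled-postcode results: (census count, CCS count, matched).
sampled = {
    "young_urban": (900, 880, 792),
    "older_rural": (1000, 990, 970),
}
factors = {name: stratum_factor(*counts) for name, counts in sampled.items()}


def adjusted_population(census_counts_by_stratum, factors):
    """Scale census counts in a non-sampled area by its stratum factors."""
    return sum(
        count * factors[stratum]
        for stratum, count in census_counts_by_stratum.items()
    )


# A non-sampled area with hypothetical census counts per stratum.
area_counts = {"young_urban": 450, "older_rural": 300}
print(round(adjusted_population(area_counts, factors)))
```

The gap between the adjusted total and the raw census count is the number of people that the ‘synthetic’ records would need to represent.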

Page 19: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Dual system estimation - formulae

                           Counted by CCS?
                           Yes      No       TOTAL
Counted by Census?   Yes   n11      n10      n1+
                     No    n01      n00      n0+
TOTAL                      n+1      n+0      n++

Total population: $n_{++} = \dfrac{n_{1+}\, n_{+1}}{n_{11}}$

• We can make life very complicated for people who aren’t mathematicians!
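As a brief sketch of where the formula comes from, assume the census and the CCS count people independently. The share of CCS people also found in the census then estimates the share of the whole population found in the census, and rearranging gives the total. The same assumption gives an estimate of the cell $n_{00}$, the people missed by both.

```latex
\frac{n_{11}}{n_{+1}} \approx \frac{n_{1+}}{n_{++}}
\quad\Longrightarrow\quad
n_{++} \approx \frac{n_{1+}\, n_{+1}}{n_{11}},
\qquad
n_{00} \approx \frac{n_{10}\, n_{01}}{n_{11}}

% Check with the fish-pond numbers: n_{1+} = 100, n_{+1} = 50, n_{11} = 25
n_{++} \approx \frac{100 \times 50}{25} = 200
```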

Page 20: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Application to administrative data

• Administrative data sources also have undercount

• But the bigger problems are due to time lags
- Emigration; deaths: results in overcount in administrative sources
- Internal migration: results in people recorded in the wrong location (overcount in one area, undercount in another)

• Just applying Dual System Estimation would result in significant over-estimation
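A small numerical sketch of why overcount inflates the estimate. All figures are invented: an area with 1,000 residents, an administrative list that covers 900 of them but also holds 100 stale records (people who have emigrated, died or moved away), and a coverage survey that finds 800 residents.

```python
def dual_system_estimate(list_count, survey_count, matched):
    """Standard dual system estimate from two lists and their overlap."""
    return list_count * survey_count / matched


survey_count = 800    # residents found by the coverage survey (hypothetical)
admin_current = 900   # admin records for people who really live here
admin_stale = 100     # admin records for people no longer resident
matched = 720         # survey people also on the admin list (90% of 800)

# With a clean list the estimate recovers the true population of 1,000.
clean = dual_system_estimate(admin_current, survey_count, matched)

# Stale records can never be matched by the survey, but they still
# increase the list count, so the estimate is pushed upwards.
inflated = dual_system_estimate(admin_current + admin_stale, survey_count, matched)

print(clean, round(inflated, 1))
```

In this example the stale records raise the estimate by about 11%, which is the over-estimation the slide warns about.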

Page 21: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Potential overcount estimation approaches (1)

• Redesigned coverage survey asking:
- who usually lives here?
- when did you move in?
- where are you registered to vote?
- where are you registered with a GP?
- who lived here before you?
- where do they live now?
- does John Smith still live here?

• Increasing sensitivity
• Reducing appropriateness / legality

Page 22: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Potential overcount estimation approaches (2)

• Match new coverage survey to admin data
• Measure coverage patterns, develop models

• Intermediate model
- Match records only in CS postcodes

• Full linkage model
- Match records in all sources across all postcodes
- Keep records if same location on all datasets => more likely to be correct
- Particularly if recently recorded ‘activity’
- Develop intelligent rules to resolve residual records
- Reduces scale of overcount, but increases undercount
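The "keep records if same location on all datasets" rule can be sketched as below. This assumes records have already been linked to a common person identifier, which is itself the hard data-matching step; the identifiers, source names and postcodes are all hypothetical.

```python
from collections import defaultdict


def classify_records(sources):
    """Split linked people into 'keep' and 'residual' groups.

    sources maps a source name to {person_id: postcode}.
    A person is kept only if they appear in every source at one postcode;
    everyone else is left for further rules to resolve.
    """
    locations = defaultdict(dict)
    for source_name, records in sources.items():
        for person_id, postcode in records.items():
            locations[person_id][source_name] = postcode

    keep, residual = {}, {}
    for person_id, by_source in locations.items():
        postcodes = set(by_source.values())
        if len(by_source) == len(sources) and len(postcodes) == 1:
            keep[person_id] = postcodes.pop()
        else:
            residual[person_id] = by_source
    return keep, residual


# Hypothetical linked extracts from three sources.
sources = {
    "patient_register": {"p1": "AB1 2CD", "p2": "EF3 4GH"},
    "cis": {"p1": "AB1 2CD", "p2": "ZZ9 9ZZ"},
    "electoral_roll": {"p1": "AB1 2CD"},
}
keep, residual = classify_records(sources)
print(keep)            # {'p1': 'AB1 2CD'}
print(sorted(residual))  # ['p2']
```

Person `p2` is genuinely resident somewhere but is set aside, which illustrates the trade-off on the slide: a strict rule reduces overcount at the cost of more undercount.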

Page 23: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Small Area Estimation

• Surveys only give sufficient precision at relatively high levels of geography

• Users require information at lower levels
- Census ‘output area’ ~ 125 households / 300 people

• SAE - family of methods to increase precision of survey estimates at lower geographies by “borrowing strength” from other, more detailed data sources, or neighbouring areas

• Widely used by National Statistical Institutes
- e.g. unemployment, income, households in poverty
- but generally univariate, estimating means
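One common way of "borrowing strength" is a composite estimator that shrinks a noisy direct survey estimate towards a model-based (synthetic) estimate. The sketch below assumes both variances are known, which is a simplification, and all numbers are invented; it illustrates the idea rather than any method ONS has adopted.

```python
def composite_estimate(direct, direct_variance, synthetic, model_variance):
    """Weighted combination of a direct and a synthetic estimate.

    direct          -- survey estimate for the small area
    direct_variance -- sampling variance of that estimate
    synthetic       -- model prediction, e.g. from admin-data covariates
    model_variance  -- variance of true area values around the model
    """
    # The weight on the direct estimate grows as the survey gets more precise.
    weight = model_variance / (model_variance + direct_variance)
    return weight * direct + (1 - weight) * synthetic


# Hypothetical small area: the survey says 12%, the model predicts 8%.
estimate = composite_estimate(
    direct=0.12, direct_variance=0.0016,
    synthetic=0.08, model_variance=0.0004,
)
print(round(estimate, 3))  # 0.088
```

Because the direct estimate here is much noisier than the model, it receives a weight of only 0.2 and the result sits close to the synthetic value.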

Page 24: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Precision of direct survey outputs

CVs, sample size = 1,000,000 people. Columns give the CV at each prevalence.

Area           Population size   0.5%      1%        5%        10%       15%       20%       50%
National       50,000,000        1.4%      1.0%      0.4%      0.3%      0.2%      0.2%      0.1%
Region         5,500,000         4.3%      3.0%      1.3%      0.9%      0.7%      0.6%      0.3%
LA             150,000           25.8%     18.2%     8.0%      5.5%      4.3%      3.7%      1.8%
LA (small)     50,000            44.6%     31.5%     13.8%     9.5%      7.5%      6.3%      3.2%
MSOA (avg)     7,200             117.6%    82.9%     36.3%     25.0%     19.8%     16.7%     8.3%
MSOA (min)     5,000             141.1%    99.5%     43.6%     30.0%     23.8%     20.0%     10.0%
LSOA (avg)     1,600             249.4%    175.9%    77.1%     53.0%     42.1%     35.4%     17.7%
LSOA (min)     1,000             315.4%    222.5%    97.5%     67.1%     53.2%     44.7%     22.4%
OA             300               575.9%    406.2%    178.0%    122.5%    97.2%     81.6%     40.8%
Ward (Eng)     7,000             119.2%    84.1%     36.8%     25.4%     20.1%     16.9%     8.5%
Ward (Wales)   3,500             168.6%    118.9%    52.1%     35.9%     28.5%     23.9%     12.0%
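The figures in the table are consistent with the coefficient of variation of an estimated proportion under simple random sampling, with the 1,000,000 sample spread across areas in proportion to population (a 2% sampling fraction) and no finite population correction. That is an inference from the numbers, not something the slides state. A sketch under that assumption:

```python
import math

SAMPLE_SIZE = 1_000_000            # total sample, from the slide
NATIONAL_POPULATION = 50_000_000   # national population, from the slide


def direct_cv(area_population, prevalence):
    """CV of a direct estimate of a proportion in one area.

    Assumes the area's sample size is its proportional share of the
    national sample, and simple random sampling within the area.
    """
    area_sample = area_population * SAMPLE_SIZE / NATIONAL_POPULATION
    return math.sqrt((1 - prevalence) / (area_sample * prevalence))


# Reproduce two cells: an output area at 0.5% prevalence and an LA at 10%.
print(f"{direct_cv(300, 0.005):.1%}")     # 575.9%
print(f"{direct_cv(150_000, 0.10):.1%}")  # 5.5%
```

An output area of 300 people contributes only about 6 sampled people, which is why the CV for rare characteristics exceeds 500%.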

Page 25: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Potential components

• (Very?) Large survey

• Administrative sources
- aggregate (area based) or unit record
- available for lower geographic levels than survey outputs

• Possible models
- Generalised Linear Models (GLM): multi-level models; spatial / temporal extensions can add power; Bayesian or frequentist estimation frameworks
- Micro-simulation

Page 26: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Small area modelling - issues

• Quality of ancillary data is absolutely critical
• Most existing applications use census covariates
• More powerful models incorporate time and space effects, but are more complex
• Every variable is different, and requires different models
• There’s often no substitute for geography as a predictor
- ‘similar people gather in similar areas’
• BUT clear academic view – the methods exist, it just depends on data

Page 27: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Beyond 2011 - Timeline - the key decision

[Timeline chart, 2011 to 2023]

BEYOND 2011 ‘Phase 1’ (2011 to 2014): initiation; research / definition
Sept 2014: recommendation & decision point
ADMIN DATA SOLUTION (2015 to 2023): detailed design; procure / develop; develop / test; outputs (population estimates, population characteristics)
TRADITIONAL CENSUS SOLUTION (2015 to 2023): detailed design; procure / develop; develop / test; rehearse; run; outputs

Page 28: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Beyond 2011 - Timeline (non census solution)

[Timeline chart, 2011 to 2024]

2011 to 2014: initiation; research / definition
2015 onwards: detailed design; procure / develop; develop / test; outputs (population estimates, population characteristics)

address register: required on an ongoing basis – ideally the National Address Gazetteer – subject to confirmation of quality
admin sources: public sector & commercial ? developing over time
coverage surveys: testing, then continuous assessment
attribute surveys: info from existing surveys – e.g. labour force survey, integrated household survey etc, supplemented by new targeted surveys as required
modelling: increasing modelling over time
linkage: test, then increasing linkage over time

Page 29: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Beyond 2011 - and into the future

[Timeline chart, 2024 to 2038]

• address register required on an ongoing basis
• administrative sources will change and disappear and be added & develop over time
• continuous coverage survey
• existing surveys
• increasing linkage over time
• increasing modelling over time
• need for attribute surveys declines over time ?
• regular production of population and attribute estimates
• ongoing methodology refinement

Page 30: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Improving quality & quantity

[Chart: time axis marked 2013, 2021, 2031]

• accuracy of population estimates
• accuracy of characteristics estimates
• range of topics
• small area detail
• multivariate small area detail

experimental statistics develop to become national statistics

Page 31: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Statistical benefit profile

[Chart: benefit over time (2011, 2021, 2031, 2041) for the Census and the Alternative method, with regions marked ‘loss’ and ‘gain’]

Page 32: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Cost profile (real terms)

[Chart: cost over time (2011, 2021, 2031, 2041) for the Census and the Alternative method (???)]

Page 33: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Next steps

• Research potential methods and models
• Using census data
- To understand coverage patterns in admin data
- To simulate new survey designs
- As a gold standard – how well can we replicate census results?
• Assess quality, costs, benefits, risks
• Discuss with stakeholders (!)
• Public acceptability research
• Report progress every six months
• Make recommendations in 2014

Page 34: Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton

Advice and assistance very welcome!

