Statistical Modelling for Official Migration Statistics:
State of the Art and Perspectives
DIME/ITDG Plenary | Eurostat | Luxembourg | 14 February 2017
Jakub Bijak
William Davies, in The Guardian, 19 January 2017
Outline
• Background and policy needs
• Current state of the art
• Promising applications
• Selected examples
• Perspectives
Outline Background State of the Art Applications Examples Perspectives
Photo: Mstyslav Chernov [CC BY-SA 4.0], via Wikimedia Commons
Importance
Background
Art. 9 “scientifically based and well documented statistical estimation methods may be used” for official statistics on migration and asylum
Methods
• Design-based (chiefly frequentist)
– Strength: model formulation
– Challenge: coherent inference
• Model-based (chiefly Bayesian)
– Strength: inference given model
– Challenge: model can be wrong
After: R J Little (2006) American Statistician, 60(3): 213–223.
Policy needs
• Usefulness
• Readiness
• Timeliness
Policy needs
• Trade-offs: generally two out of three
–Administrative data: timely and ready
–Big data, potentially: useful and timely
–Design-based surveys: ready and useful
• Can model-based approaches help?
State of the art
• Sources of data on flows in EU/EFTA countries
Map legend: Nordic registers; other registers; mostly surveys
Map: Wikimedia Commons, https://commons.wikimedia.org/wiki/File:Europe_political_chart_complete_blank.svg, CC-BY-SA 3.0
Meta-data: Eurostat, http://ec.europa.eu/eurostat/cache/metadata/en/migr_immi_esms.htm
State of the art
• Migration estimates mostly administrative or design-based
• Harmonisation of definitions envisaged in Regulation 862/2007 – UN ‘Gold standard’
• ‘Statistical mainstreaming of migration’ by Eurostat
– Example: LFS migration modules and questions
• Model-based work mainly academic
State of the art
• Discrepancies despite harmonisation efforts
Examples from the Eurostat database for 2014: good, moderate, and problematic alignment
Receiving country data
To: From:    ES        IT        UK
ES           -         3,427     33,325
IT           14,781    -         17,587
UK           17,747    3,740     -

Sending country data
To: From:    ES        IT        UK
ES           -         9,477     33,851
IT           4,701     -         14,991
UK           18,002    :         -

(“:” = not available)
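The size of these discrepancies can be made explicit by dividing each receiving-country figure by the corresponding sending-country figure. The sketch below does this for the 2014 values shown above; the cell keys simply reuse the row and column labels as printed, and the function name `discrepancy_ratios` is illustrative rather than part of any Eurostat tool.

```python
# Reported flows for 2014, copied from the two tables above.
# Keys are (row label, column label) exactly as printed.
# None marks ":" (value not available).
receiving = {
    ("ES", "IT"): 3427, ("ES", "UK"): 33325,
    ("IT", "ES"): 14781, ("IT", "UK"): 17587,
    ("UK", "ES"): 17747, ("UK", "IT"): 3740,
}
sending = {
    ("ES", "IT"): 9477, ("ES", "UK"): 33851,
    ("IT", "ES"): 4701, ("IT", "UK"): 14991,
    ("UK", "ES"): 18002, ("UK", "IT"): None,
}


def discrepancy_ratios(receiving, sending):
    """Return the receiving/sending ratio per cell, or None if a report is missing."""
    ratios = {}
    for cell, received in receiving.items():
        sent = sending.get(cell)
        if received is None or sent is None or sent == 0:
            ratios[cell] = None
        else:
            ratios[cell] = received / sent
    return ratios


ratios = discrepancy_ratios(receiving, sending)
for cell in sorted(ratios):
    ratio = ratios[cell]
    print(cell, "n/a" if ratio is None else f"{ratio:.2f}")
```

A ratio close to 1 indicates that the two countries report a similar figure for the same cell; ratios far from 1 flag the cells where reconciliation is most needed.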
State of the art
• Different stakeholders – different categories
• Need for a flexible reconciliation of data
Figure labels: migrants, asylum seekers, refugees; durations of 3 months, 6 months, 1 year, 5 years
Applications
Areas crucial for migration statistics:
• Reconciliation of data sources
• Small-domain estimation
• Complexity and uncertainty
• ‘Big Data’
• Privacy and disclosure control
Reconciliation of sources
• Harmonisation of definitions
• Flexibility
• Administrative sources
• Survey data
• ‘Big data’
Small domains
• Small geographic units
– Random variability in survey data (design)
– Non-random errors in other sources (models)
• Short time series
– Migration projections/forecasts (models)
• Other domains
– Cross-classification of migration and other variables (design and models)
Complexity and uncertainty
• Migration is complex and uncertain
• Need to reflect that in official statistics
• Design-based approaches alone cannot achieve that:
– Complexity requires modelling
– Uncertainty far beyond sampling error
• Borrowing of strength through modelling
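One common way to borrow strength is a composite (shrinkage) estimator, which pulls a noisy direct survey estimate towards a model-based prediction, more strongly when the sampling variance is large. The sketch below is a minimal illustration under stated assumptions: the area names, estimates, variances and the constant `MODEL_VAR` are all hypothetical, and the variances are treated as known.

```python
def shrink(direct, sampling_var, model_pred, model_var):
    """Precision-weighted compromise between a direct and a model-based estimate."""
    # Weight on the direct estimate: close to 1 when the survey is precise,
    # close to 0 when its sampling variance dominates.
    weight = model_var / (model_var + sampling_var)
    return weight * direct + (1.0 - weight) * model_pred


# Hypothetical inputs: (direct survey estimate, sampling variance) per area.
areas = {
    "area_1": (120.0, 25.0),
    "area_2": (40.0, 400.0),
    "area_3": (300.0, 2500.0),
}
# Hypothetical model-based predictions for the same areas.
model_pred = {"area_1": 110.0, "area_2": 70.0, "area_3": 220.0}
MODEL_VAR = 100.0  # assumed variance of the model error

estimates = {
    name: shrink(direct, var, model_pred[name], MODEL_VAR)
    for name, (direct, var) in areas.items()
}
for name, value in estimates.items():
    print(name, round(value, 1))
```

The area with the smallest sampling variance stays close to its direct estimate, while the noisiest area is drawn almost entirely to the model prediction.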
‘Big Data’
Key challenges:
• Biases in data
• Ethical issues
• The ‘4V’: volume, velocity, variety, veracity
See also: E Zagheni and I Weber (2012), WebSci 2012.
Privacy and disclosure control
• Issues especially with individual-level and linked data
• But: explicit trade-offs between disclosure risk and benefits of data
After: S E Fienberg (2011) Statistical Science, 26(2): 212–226.
Example: Reconciliation
• Bayesian hierarchical model for 31 countries
Source: J Raymer et al. (2013) Journal of the American Statistical Association, 108 (503): 804 & 811.
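The basic idea of reconciling two reports of the same flow can be sketched with a much simpler measurement model than the hierarchical one cited above. Assume each report equals the true flow times a coverage factor, with normal error on the log scale and a flat prior on the log of the true flow; the posterior is then normal with a precision-weighted mean. The coverage factors and error standard deviations below are hypothetical, chosen only for illustration, and `reconcile` is not taken from the cited paper.

```python
import math
from statistics import NormalDist


def reconcile(reports, level=0.9):
    """Combine several reports of one flow.

    reports: list of (count, coverage, log_sd) tuples, where coverage is the
    assumed share of the true flow captured and log_sd the assumed standard
    deviation of the error on the log scale.
    Returns (median, lower, upper) for the true flow.
    """
    precisions = [1.0 / log_sd ** 2 for _, _, log_sd in reports]
    corrected = [math.log(count / coverage) for count, coverage, _ in reports]
    total_precision = sum(precisions)
    mean = sum(p * z for p, z in zip(precisions, corrected)) / total_precision
    sd = math.sqrt(1.0 / total_precision)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.exp(mean), math.exp(mean - z * sd), math.exp(mean + z * sd)


# Two reports of the same cell from the 2014 tables (3,427 and 9,477),
# with HYPOTHETICAL coverage factors and error standard deviations.
reports = [(3427, 0.5, 0.3), (9477, 0.9, 0.2)]
median, lower, upper = reconcile(reports)
print(f"median {median:.0f}, 90% interval {lower:.0f} to {upper:.0f}")
```

In a full hierarchical model the coverage and accuracy parameters would themselves be estimated, typically by pooling information across countries, rather than fixed in advance as they are here.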
Example: Reconciliation
• Similar model – one country, many sources
Source: NG Disney (2015) PhD thesis, University of Southampton.
Example: Small areas
• Design-and-model-based estimates
Source: Office for National Statistics, via www.ons.gov.uk
Example: Uncertainty
• Migration projections for the UN WPP
Source: Fig. 1 from JJ Azose and AE Raftery (2015), Demography, 52(5): 1627–50.
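How forecast uncertainty grows with the projection horizon can be illustrated with a first-order autoregressive process, one simple building block for probabilistic migration projections. The sketch below computes the predictive interval analytically for fixed, hypothetical parameter values; it ignores parameter uncertainty, which a Bayesian treatment would include, so it is an illustration rather than the method behind the figure.

```python
import math
from statistics import NormalDist


def ar1_predictive_interval(last, mean, phi, sigma, horizon, level=0.8):
    """Predictive interval for an AR(1) process `horizon` steps ahead.

    Model: x[t] - mean = phi * (x[t-1] - mean) + e[t], with e[t] ~ N(0, sigma^2).
    Returns (lower, centre, upper).
    """
    if not -1.0 < phi < 1.0:
        raise ValueError("phi must lie strictly between -1 and 1")
    centre = mean + phi ** horizon * (last - mean)
    variance = sigma ** 2 * (1.0 - phi ** (2 * horizon)) / (1.0 - phi ** 2)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * math.sqrt(variance)
    return centre - half_width, centre, centre + half_width


# Hypothetical net migration rate (per 1,000) and parameters.
intervals = {
    h: ar1_predictive_interval(last=4.0, mean=1.0, phi=0.9, sigma=1.5, horizon=h)
    for h in (1, 5, 10, 20)
}
for h, (low, centre, high) in intervals.items():
    print(f"horizon {h:2d}: {centre:5.2f}  [{low:6.2f}, {high:5.2f}]")
```

The central forecast reverts towards the long-run mean while the interval widens with the horizon, levelling off at the stationary spread of the process.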
Example: Flows from stocks
Source: Guy Abel, via https://gjabel.wordpress.com
See also: GJ Abel and N Sander (2014). Science, 343 (6178).
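Estimating a flow table that must agree with known totals is often done with iterative proportional fitting (IPF), a standard technique for margin-constrained tables. The sketch below is a simplified illustration, not the method of the cited work: it fits a three-region origin-destination table with a zero diagonal to hypothetical outflow and inflow totals, which in a stock-based approach would be derived from changes in migrant stocks.

```python
def ipf(seed, row_targets, col_targets, tol=1e-8, max_iter=1000):
    """Scale `seed` so that its row and column sums match the targets."""
    table = [row[:] for row in seed]
    for _ in range(max_iter):
        # Row step: rescale each row to its target total.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                table[i] = [value * target / total for value in table[i]]
        # Column step: rescale each column to its target total.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
        # Columns now match exactly, so only the rows need checking.
        row_error = max(abs(sum(row) - t) for row, t in zip(table, row_targets))
        if row_error < tol:
            break
    return table


# Seed table: equal off-diagonal values, zero diagonal (no flow to self).
seed = [[0.0, 1.0, 1.0],
        [1.0, 0.0, 1.0],
        [1.0, 1.0, 0.0]]
outflows = [100.0, 50.0, 30.0]  # hypothetical row totals (by origin)
inflows = [60.0, 70.0, 50.0]    # hypothetical column totals (by destination)

flows = ipf(seed, outflows, inflows)
for row in flows:
    print([round(value, 1) for value in row])
```

Structural zeros in the seed are preserved, so the diagonal stays empty while the off-diagonal cells absorb the adjustment.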
Perspectives
• Within easy reach
– Micro-level data integration
– Macro-level integration and modelling
– Inference on linked data
• Longer-term agenda
– Non-traditional sources
– Early warning systems
Early warnings
Figure: Syrian refugees (UNHCR)
After: J Bijak (2016) http://gmdac.iom.int/gmdac-data-briefing-6
ABC of current challenges
• A: Administrative sources
• B: Big data
• C: Complex problems
Specific issues
• New ethical and privacy concerns in Big data
• Assurance of ‘scientific validity’ of estimates
• Communication of model-based results
• Uncertainty of estimation
...but all of them are already recognised and none is insurmountable
New methods
• Methodological innovation, e.g. Roderick J Little’s ‘Calibrated Bayes’
– Model: inference
– Design: calibration
After: R J Little (2012) Journal of Official Statistics, 28(3): 309–334.
Group effort
Different levels of collaboration
• Between European countries
• Between data producers and users
• Between official statistics and academia
• Between survey statisticians and modellers
Final points
• Modelling can help utilise the data better
• Design and models are not contradictory: design can lead to models
• Harmonisation still needs attention
• New challenges for the future
John Pullinger, UK National Statistician:
“We are far from ‘leaving behind the age of statistics’ ... Quite the reverse. This is the moment when we can make our greatest contribution to society by providing the better statistics that allow for better decisions.”
With credit to: GJ Abel, JJ Azose, E Dodd, JJ Forster, JD Hilton, T King, SL Nurse, AE Raftery, J Raymer, PWF Smith, A Wiśniowski