Date posted: 15-Apr-2017
Category: Data & Analytics
Uploaded by: david-millson
Using Twitter to predict Norovirus outbreaks
David Millson on behalf of Callum Staff
THE PROJECT
© 2015 Food Standards Agency
Predicting and reducing Norovirus
• Tweets discussing Norovirus and its symptoms were first identified as a
proxy indicator for Norovirus cases through an MSc project
• Outbreaks are predicted using rises in tweets about Norovirus
symptoms (diarrhoea and vomiting)
• Predictions are used to inform FSA and NHS Choices interventions
• Interventions can prevent outbreaks from getting out of control by
encouraging sufferers to stay at home and avoid passing the bug on
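As a rough illustration of "rises in tweets" acting as a trigger, the sketch below flags any week whose symptom-tweet count grew by more than a chosen fraction over the previous week. The slides do not say how a rise was measured, so the week-on-week definition, the function name `flag_rises`, the `threshold` value and the counts are all assumptions made for illustration.

```python
def flag_rises(weekly_counts, threshold=0.25):
    """Return the indices of weeks whose tweet count rose by more than
    `threshold` (as a fraction) relative to the previous week.

    Assumption: a "rise" is a week-on-week relative increase. The slides
    do not define it, so treat this as one possible reading.
    """
    flagged = []
    for week in range(1, len(weekly_counts)):
        previous = weekly_counts[week - 1]
        current = weekly_counts[week]
        # Skip weeks with no baseline to avoid dividing by zero.
        if previous > 0 and (current - previous) / previous > threshold:
            flagged.append(week)
    return flagged


# Made-up weekly counts of symptom tweets, for illustration only.
example_counts = [100, 104, 98, 130, 170, 165]
print(flag_rises(example_counts))  # prints [3, 4]
```

In practice the counts would come from tweets matched against the symptom keywords, and a flagged week would prompt a look at whether to start communications activity.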
Citizen Need / Business Need
• Citizen Need:
  • More timely surveillance = quicker reactions = more cases prevented
  • Lower burden on economy and public
• Business Need:
  • To understand the predictive power of social media data
  • To demonstrate to Government the value of including social media
    analysis in surveillance strategy
FSA and social media analysis – from little acorns
• Cabinet Office approached the FSA to pilot a joint project with Ipsos
MORI using machine learning to categorise Tweets
• Led in producing cross-government guidance on social media research
• Set up a review and innovation group to bring the expertise of industry
and academia to Government social media research
• Designed and presented a workshop on using social media in policy and
analysis
BUILDING THE MODEL
Crowd-sourcing the keywords
Excluding bad keywords
Do people really tweet when they have Norovirus?
The trade-off between usefulness and rigour
• We can rigorously predict Norovirus cases three weeks after they
happen
• To be useful to Communications, we need to predict them three weeks
before they happen
[Chart legend: Tweets, Community cases]
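One way to examine the timing relationship between tweets and community cases is to correlate the two weekly series at different offsets. The sketch below is a minimal version of that idea: it measures how well tweets in week t line up with cases in week t + lead. The helper names, the Pearson measure and the data are assumptions for illustration; the slides do not describe the method actually used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(tweets, cases, lead_weeks):
    """Correlate tweets in week t with cases in week t + lead_weeks.

    Only non-negative leads are handled in this sketch.
    """
    if lead_weeks == 0:
        return pearson(tweets, cases)
    return pearson(tweets[:-lead_weeks], cases[lead_weeks:])


# Made-up data: cases are built to follow tweets by exactly three weeks.
tweets = [10, 12, 30, 55, 40, 20, 12, 10, 11, 10]
cases = [5, 5, 5] + [2 * t for t in tweets[:-3]]

# Pick the lead (0 to 4 weeks) with the strongest correlation.
best_lead = max(range(5), key=lambda k: lagged_correlation(tweets, cases, k))
print(best_lead)  # prints 3
```

With real data the correlations would be far from perfect, and a strong correlation at a given lead would only suggest, not prove, that tweets give that much advance warning.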
Calibrating the Cut Off Value
• Cut-off values shown on successive slides: 0.35, 0.30, 0.25, 0.20, 0.25
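Calibrating a cut-off can be thought of as trying several candidate values and checking, for each, how many known outbreak weeks are caught and how many false alarms are raised. The sketch below does this for the four values that appear on the slides. What the cut-off applies to is not stated in the deck, so the week-on-week relative rise used here, the function names, the tweet counts and the outbreak weeks are all hypothetical and do not reproduce the FSA's calibration.

```python
def flag_rises(weekly_counts, cut_off):
    """Weeks whose count rose by more than `cut_off` (as a fraction)
    relative to the previous week. Assumed definition of a "rise"."""
    return [
        week
        for week in range(1, len(weekly_counts))
        if weekly_counts[week - 1] > 0
        and (weekly_counts[week] - weekly_counts[week - 1])
        / weekly_counts[week - 1] > cut_off
    ]


def score_cut_off(weekly_counts, outbreak_weeks, cut_off):
    """Return (hits, false_alarms, missed) for one candidate cut-off."""
    flagged = set(flag_rises(weekly_counts, cut_off))
    hits = len(flagged & outbreak_weeks)
    false_alarms = len(flagged - outbreak_weeks)
    missed = len(outbreak_weeks - flagged)
    return hits, false_alarms, missed


# Made-up tweet counts and made-up "true" outbreak weeks.
counts = [100, 122, 98, 130, 170, 165, 210]
known_outbreak_weeks = {3, 4, 6}

for cut_off in (0.35, 0.30, 0.25, 0.20):
    print(cut_off, score_cut_off(counts, known_outbreak_weeks, cut_off))
```

A higher cut-off raises fewer false alarms but misses more outbreaks, and a lower one does the opposite, which is the trade-off any calibration has to settle.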
USING THE RESULTS
Outbreak predicted
Outbreak reduced (hopefully)
VALUE FOR MONEY
Value for Money
• Cost of the project
  • One analyst working approx. one day a week for 2/3 of a year: approx. £2,500
  • NHS Choices spend and external research bring the total to approx. £20,000
• Cost of Norovirus
  • Estimated 2.8 million cases a year in the UK, at a cost of £120 million
WOULD NEED TO PREVENT JUST 500 CASES A YEAR, OR 0.02% OF
THE TOTAL, TO BE PROVIDING VALUE FOR MONEY
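The break-even figure can be checked with simple arithmetic from the numbers on this slide. The sketch below assumes the £120 million is spread evenly across all cases and sets the full £20,000 project cost against a single year; both are simplifying assumptions, and the variable names are illustrative.

```python
# Figures taken from the slide.
annual_cases = 2_800_000
annual_cost_gbp = 120_000_000
project_cost_gbp = 20_000

# Assumption: every case costs the same average amount.
cost_per_case = annual_cost_gbp / annual_cases

# Cases that must be prevented for savings to cover the project cost.
break_even_cases = project_cost_gbp / cost_per_case

# The slide's rounded figure of 500 cases, as a share of all cases.
share_of_total_pct = 500 / annual_cases * 100

print(f"Average cost per case: £{cost_per_case:.2f}")         # £42.86
print(f"Break-even cases prevented: {break_even_cases:.0f}")  # 467
print(f"500 cases as % of total: {share_of_total_pct:.3f}%")  # 0.018%
```

The result of roughly 467 cases is consistent with the slide's rounded "just 500 cases a year, or 0.02% of the total".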
NEXT STEPS
Spatial Mapping
Summary
• A marriage of “supply” and “demand”:
  • Twitter identified as a measure of Norovirus, providing information
    much more rapidly than lab reports
  • A need to roll out public information on Norovirus at the right time to
    make the biggest impact
• A gateway project to demonstrate the value of social media analysis
• Low cost and therefore low risk, with potentially high rewards