Bringing Together the Social and Technical in Big Data Analytics: Why You Can't Predict the Flu from...

Date post: 19-Jan-2016
Upload: kerry-morton
Transcript
Page 1: Bringing Together the Social and Technical in Big Data Analytics: Why You Can't Predict the Flu from Twitter, and Here's How David A. Broniatowski Asst.

Bringing Together the Social and Technical in Big Data Analytics: Why You Can't

Predict the Flu from Twitter, and Here's How

David A. Broniatowski

Asst. Prof. EMSE

http://www.seas.gwu.edu/~broniatowski

Page 2:

PUBLIC HEALTH CYCLE

Population

Doctors

Surveillance

Intervention

Page 3:

• Traditional mechanisms

• Surveys

• Clinical visits

REQUIRES: DATA ON THE POPULATION

Relying on these traditional mechanisms has limited research

Page 4:

TWITTER

• Short messages (140 chars) posted to the public internet

• Content: news, conversation, pointless babble

• Huge volume

• 500 million tweets a day

Page 5:

WHY TWITTER?

• Huge volumes of data

• A constant stream of small updates

• Nothing like waiting in line to buy cigarettes behind a guy in a business suit buying gasoline with ten dollars in dimes

• I eat pizza too much

• I'm at Cvs Pharmacy (117th and kendall, Miami)

Page 6:

INFLUENZA SURVEILLANCE

Page 7:

INFLUENZA SURVEILLANCE

• CDC has a nationwide surveillance network with 2,700 outpatient centers reporting

• ILI: influenza-like illness

• Cons:

• Slow (2 weeks)

• Varying levels of geographic granularity

Page 8:

TWITTER SURVEILLANCE

• Twitter influenza surveillance must

• 1) Accurately track ground truth

• Identify infection tweets

• 2) Be effective at both municipal and national levels

• Expand tweet geolocation and evaluate municipal accuracy

• 3) Be predictive in real time

• Deploy previously trained system on this flu season

Page 9:
Page 10:
Page 11:

PIPELINE CLASSIFIERS

• Three steps using supervised machine learning + NLP

• Step 1: Identify health tweets

• Step 2: Identify flu-related tweets

• Step 3: Separate awareness from infection
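
The three steps above form a cascade in which each stage only sees tweets that passed the previous one. The sketch below shows that structure only. The stage functions are hypothetical keyword rules standing in for the trained classifiers, which the slides do not describe in detail, and the example tweets are partly taken from the earlier slide and partly invented.

```python
from typing import Callable, Iterable, List

# A stage takes the tweet text and returns True if the tweet passes.
Stage = Callable[[str], bool]


def is_health(text: str) -> bool:
    # Hypothetical stand-in for the step 1 classifier (health vs. not health).
    words = ("flu", "sick", "fever", "cough", "doctor")
    return any(w in text.lower() for w in words)


def is_flu_related(text: str) -> bool:
    # Hypothetical stand-in for the step 2 classifier (flu related or not).
    return "flu" in text.lower()


def is_infection(text: str) -> bool:
    # Hypothetical stand-in for step 3: the author reports being ill
    # (infection) rather than talking about flu in general (awareness).
    cues = ("i have", "i've got", "home sick", "my fever")
    return any(c in text.lower() for c in cues)


def run_pipeline(tweets: Iterable[str], stages: List[Stage]) -> List[str]:
    """Apply each stage in order, keeping only tweets that pass all of them."""
    kept = list(tweets)
    for stage in stages:
        kept = [t for t in kept if stage(t)]
    return kept


tweets = [
    "I eat pizza too much",
    "Worried about the flu this year, getting my shot",
    "I have the flu and my fever won't break",
    "Seeing the doctor about my knee",
]
infection_tweets = run_pipeline(tweets, [is_health, is_flu_related, is_infection])
print(infection_tweets)
```

In a real system each stage would be a supervised model trained on labeled tweets; the cascade shape is the only part this sketch is meant to illustrate.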

Page 12:

TWITTER SURVEILLANCE

• Twitter influenza surveillance must

• 1) Accurately track ground truth

• Identify infection tweets

• 2) Be effective at both municipal and national levels

• Expand tweet geolocation and evaluate municipal accuracy

• 3) Be predictive in real time

• Deploy previously trained system on this flu season

Page 13:

LOCAL EFFECTIVENESS

• Current work focuses on US national flu rates

• Useful surveillance needed by region/state/city

• How can Twitter track local trends?

• Is it accurate?

• Is there enough data?

• Only about 1% of Twitter is geocoded

Page 14:
Page 15:

CARMEN (Dredze et al., 2013)

• Over 4000 known locations (countries, states, counties, cities)

• Geocoordinates only: ~1%

• Expanded locations: ~22%

• Available in Python and Java
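
The jump from ~1% to ~22% comes from falling back to other location signals when a tweet carries no geocoordinates. The sketch below illustrates that fallback idea only; it is not Carmen's API. The tweet fields (`coordinates`, `profile_location`), the two lookup tables, and the sample tweets are all hypothetical.

```python
from typing import Dict, Optional, Tuple

Location = Tuple[str, str, str]  # (country, state, city)

# Hypothetical miniature tables; a real resolver would hold thousands of entries.
CITY_CENTERS: Dict[Location, Tuple[float, float]] = {
    ("United States", "Maryland", "Baltimore"): (39.29, -76.61),
    ("United States", "Florida", "Miami"): (25.76, -80.19),
}
PROFILE_ALIASES: Dict[str, Location] = {
    "baltimore, md": ("United States", "Maryland", "Baltimore"),
    "miami": ("United States", "Florida", "Miami"),
}


def nearest_city(coords: Tuple[float, float],
                 max_degrees: float = 0.5) -> Optional[Location]:
    """Return the closest known city within max_degrees, else None."""
    lat, lon = coords
    best, best_d2 = None, max_degrees ** 2
    for place, (clat, clon) in CITY_CENTERS.items():
        d2 = (lat - clat) ** 2 + (lon - clon) ** 2
        if d2 <= best_d2:
            best, best_d2 = place, d2
    return best


def resolve(tweet: dict) -> Optional[Location]:
    """Prefer geocoordinates; fall back to the free-text profile location."""
    coords = tweet.get("coordinates")
    if coords is not None:
        place = nearest_city(coords)
        if place is not None:
            return place
    profile = (tweet.get("profile_location") or "").strip().lower()
    return PROFILE_ALIASES.get(profile)


tweets = [
    {"coordinates": (39.30, -76.60), "profile_location": ""},
    {"coordinates": None, "profile_location": "Miami"},
    {"coordinates": None, "profile_location": "Baltimore, MD"},
    {"coordinates": None, "profile_location": "somewhere over the rainbow"},
]
resolved = [resolve(t) for t in tweets]

# Share of tweets located with coordinates only vs. with the fallback.
coverage_coords = sum(t["coordinates"] is not None for t in tweets) / len(tweets)
coverage_expanded = sum(r is not None for r in resolved) / len(tweets)
print(coverage_coords, coverage_expanded)
```

The toy coverage figures depend entirely on the four invented tweets and say nothing about real Twitter data.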

Page 16:

TWITTER SURVEILLANCE

• Twitter influenza surveillance must

• 1) Accurately track ground truth

• Identify infection tweets

• 2) Be effective at both municipal and national levels

• Expand tweet geolocation and evaluate municipal accuracy

• 3) Be predictive in real time

• Deploy previously trained system on this flu season

Page 17:

SURVEILLANCE RESULTS

Pearson Correlation    2009     2011

Keywords               0.97     0.646

Flu Classifier         0.97     0.519

Google Flu Trends      0.97     0.897

Infection              0.972    0.7832
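
Each figure in the table is a Pearson correlation between a weekly signal (keyword counts, classifier output, Google Flu Trends) and the CDC's weekly ILI rate. As a reminder of what that number measures, here is a minimal sketch of the computation, r = cov(x, y) / (σx · σy). The two weekly series are made up for illustration and are not data from the talk.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sx * sy)


# Made-up weekly values, for illustration only.
cdc_ili = [1.2, 1.5, 2.1, 3.4, 4.0, 3.1, 2.0]        # % of visits for ILI
twitter_rate = [0.8, 1.1, 1.9, 3.0, 3.8, 2.7, 1.6]   # infection tweets per 1000

r = pearson(cdc_ili, twitter_rate)
print(round(r, 3))
```

A high r only says the two curves rise and fall together over the season; it does not by itself show that the Twitter signal leads the CDC data in time.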

Page 18:

GOOGLE FLU TRENDS GETS IT WRONG?

Lohr, S. (2014). Google flu trends: the limits of big data. New York Times.

Page 19:

Pearson Correlation:

Keywords: 0.75

Infection: 0.93

Page 20:
Page 21:

BLIND EVALUATION

• ILI counts:

• Infection: 0.88

• Keywords: 0.72

Page 22:

2013-2014: 0.95 Correlation

Page 23:
Page 24:

MOST RECENT DATA

Broniatowski, D. A., Dredze, M., Paul, M. J., & Dugas, A. (2015). Using Social Media to Perform Local Influenza Surveillance in an Inner-City Hospital: A Retrospective Observational Study. JMIR Public Health and Surveillance, 1(1), e5.

Page 25:

PREDICTING ACTUAL FLU IN BALTIMORE

Broniatowski, D. A., Dredze, M., Paul, M. J., & Dugas, A. (2015). Using Social Media to Perform Local Influenza Surveillance in an Inner-City Hospital: A Retrospective Observational Study. JMIR Public Health and Surveillance, 1(1), e5.

Page 26:

HEALTHTWEETS.ORG

Page 27:

HEALTHTWEETS WORLDWIDE

Page 28:

Some Other Projects

David A. Broniatowski

Asst. Prof. EMSE

http://www.seas.gwu.edu/~broniatowski

Page 29:

BIG DATA FOR GROUP DECISION MAKING: EXTRACTING SOCIAL NETWORKS FROM FDA ADVISORY PANEL MEETING TRANSCRIPTS

(Broniatowski & Magee, 2013 American Journal of Therapeutics; Broniatowski & Magee, 2012 IEEE Signal Processing Magazine; Broniatowski & Magee, in preparation)

Page 30:

“GERMS ARE GERMS” AND “WHY NOT TAKE A RISK?”

MODELS AND DATA FOR RISKY DECISION MAKING IN THE ED

(Broniatowski, Klein, & Reyna, in press, Medical Decision Making; Broniatowski & Reyna, in preparation)

Page 31:

HOW DO WE DESIGN SYSTEMS TO USE INFORMATION FLOW TO OUR ADVANTAGE?

We would like to deepen our intuition regarding system architectures

(Broniatowski & Moses, in preparation)

Page 32:

QUESTIONS?

• Big data

• Influenza tracking and coupled contagion

• Group decision-making

• Individual decision-making

• Formal models

• Medical and engineering applications

• Formal and mathematical models

• Systems architecture

• Design for flexibility


