
Managing and Analyzing Global Health Data

Date post: 27-Jan-2015
Upload: institute-for-health-metrics-and-evaluation-university-of-washington
UNIVERSITY OF WASHINGTON Managing and Analyzing Global Health Data Seattle, August 30, 2011 Peter Speyer, Director of Data Development
Transcript
Page 1: Managing and Analyzing Global Health Data

UNIVERSITY OF WASHINGTON

Managing and Analyzing Global Health Data

Seattle, August 30, 2011

Peter Speyer, Director of Data Development

Page 2: Managing and Analyzing Global Health Data

IHME Background

• Global institute dedicated to providing independent, rigorous, and scientific measurements and evaluations to accelerate progress on global health

• Part of the Department of Global Health at the University of Washington

• Funded by the Bill & Melinda Gates Foundation and the state of Washington (“core funding”), and other funders through specific research grants

• Created in 2007

• 70 researchers, 30 staff


Page 3: Managing and Analyzing Global Health Data

IHME Mission

Our goal is to improve the health of the world’s populations by providing the best information on population health

Page 4: Managing and Analyzing Global Health Data


Page 5: Managing and Analyzing Global Health Data

Health-related data

• Social determinants
• Risk factors

Health data

Population-based data

• Household/facility surveys
• Census
• Vital registration
• Registries (provider, disease)

Facility-based data

• Health records
• Administrative data (financial, operational)
• Research data (DSS, clinical trials, etc.)

Individual-based data

• Personal health records
• “Quantified self”
• Disease-based social networks

Health Data Innovation

Patient engagement
Open data
Health apps

Page 6: Managing and Analyzing Global Health Data

Key health data challenges

Find & access data

Disseminate data

Use data

Page 7: Managing and Analyzing Global Health Data

Key health data challenges

• Lack of transparency

• Timeliness of data

• Lack of documentation

• Access vs. privacy

Find & access data

Disseminate data

Use data

Page 8: Managing and Analyzing Global Health Data

Key health data challenges

• Sheer quantity of data files (30TB, 20K+ source datasets, 40M files)

• Diverse source data types and formats (PDF, CSV, SPSS, CSPro, …)

• Data quality issues

Find & access data

Disseminate data

Use data
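
With tens of millions of files in formats such as PDF, CSV, SPSS, and CSPro, a useful first step is simply knowing what is on disk. The following is a minimal sketch of such an inventory, not a description of IHME's tooling; the extension-to-format mapping and the function names are illustrative assumptions.

```python
from collections import Counter
from pathlib import Path

# Illustrative mapping only: these are common file extensions for each
# format, not a list taken from the presentation.
FORMAT_BY_EXTENSION = {
    ".pdf": "PDF",
    ".csv": "CSV",
    ".sav": "SPSS",
    ".dcf": "CSPro",  # CSPro data dictionary file
}


def classify(path: Path) -> str:
    """Return a format label for one file, or 'other' if unrecognized."""
    return FORMAT_BY_EXTENSION.get(path.suffix.lower(), "other")


def inventory(root: Path) -> Counter:
    """Count regular files under root (recursively) by format label."""
    return Counter(classify(p) for p in root.rglob("*") if p.is_file())
```

Calling `inventory(Path("/data/sources"))` (a hypothetical path) would return a `Counter` keyed by format label, which makes it easy to see how much of a collection falls outside the recognized formats.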

Page 9: Managing and Analyzing Global Health Data

Key health data challenges

• Make results data engaging

• Accountability: share results, code, source data

• Accommodate diverse audiences (expertise, geographies)

Find & access data

Disseminate data

Use data

Page 10: Managing and Analyzing Global Health Data

Example: Global Burden of Disease

Mortality & causes of death

• Sources: census, surveys, vital registration, verbal autopsy

• Estimates: covariate models, spatial-temporal regressions; weighted combination of models

Morbidity

• Sources: literature reviews, surveys, registries, hospital data

• Disease modeling: compartmental Bayesian model

• Health severity weights

Burden of disease

• DALYnator


300 diseases

40 risk factors

21 regions

1990, 2005, 2010
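
Burden of disease is conventionally summarized in disability-adjusted life years (DALYs), the sum of years of life lost (YLL) and years lived with disability (YLD). The sketch below shows only that arithmetic and is not the DALYnator itself. The function names, the prevalence-based form of YLD, and the reading of the slide's "health severity weights" as weights between 0 and 1 are assumptions made for illustration.

```python
def years_of_life_lost(deaths: float, remaining_life_expectancy: float) -> float:
    """YLL: deaths times the standard remaining life expectancy at age of death."""
    return deaths * remaining_life_expectancy


def years_lived_with_disability(prevalent_cases: float, severity_weight: float) -> float:
    """YLD (prevalence-based form, assumed here): cases times a 0..1 severity weight."""
    if not 0.0 <= severity_weight <= 1.0:
        raise ValueError("severity_weight must be between 0 and 1")
    return prevalent_cases * severity_weight


def dalys(yll: float, yld: float) -> float:
    """DALYs are the sum of the fatal (YLL) and non-fatal (YLD) components."""
    return yll + yld
```

In practice the calculation would be repeated for each cause, age group, sex, region, and year, which is where the scale of 300 diseases and 21 regions comes in.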

Page 11: Managing and Analyzing Global Health Data

GBD Country Years, Causes of Death 1950-2009


Page 12: Managing and Analyzing Global Health Data

GBD Country Years, Causes of Death 1950-2009

Data source            Countries   Site-years    # of Deaths

VR                           128        4,190    722,267,710

Household surveys            136        2,827     10,132,976

Surveillance systems          12          126        717,698

National VA                   21           71        301,855

Subnational VA                59          442      2,606,815

Mortuary registries            6           25         54,316

TOTAL                                   7,680    735,564,116
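
When a table like this is transcribed from slides, it is cheap to recompute the column sums and compare them with the stated TOTAL row. The sketch below uses the figures exactly as they appear in this transcript. The function names are hypothetical, and the code makes no assumption about whether the TOTAL is meant to be a plain sum of the rows.

```python
# (data source, countries, site-years, deaths), as given in the transcript.
ROWS = [
    ("VR", 128, 4_190, 722_267_710),
    ("Household surveys", 136, 2_827, 10_132_976),
    ("Surveillance systems", 12, 126, 717_698),
    ("National VA", 21, 71, 301_855),
    ("Subnational VA", 59, 442, 2_606_815),
    ("Mortuary registries", 6, 25, 54_316),
]

# (site-years, deaths) from the TOTAL row of the transcript.
STATED_TOTAL = (7_680, 735_564_116)


def column_sums(rows):
    """Return (site-years, deaths) summed over all rows."""
    site_years = sum(row[2] for row in rows)
    deaths = sum(row[3] for row in rows)
    return site_years, deaths


def differences(rows, stated):
    """Return (computed - stated) for site-years and deaths."""
    site_years, deaths = column_sums(rows)
    return site_years - stated[0], deaths - stated[1]
```

`differences(ROWS, STATED_TOTAL)` returns the gap for each column. A nonzero value is a prompt to recheck the transcription or the source, not proof of an error.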

Page 13: Managing and Analyzing Global Health Data

Solutions: computing infrastructure

• Analysis with statistical packages

– Projects with 100K+ lines of code

• File system

– 60TB disk space

– Redundant backup

• Cluster with 63 nodes (+300% in 2011), ~2000 cores

– Runs 24x7, very little downtime

• Virtual environments to test new applications, serve them to collaborators, etc.

Page 14: Managing and Analyzing Global Health Data

Solutions: Global Health Data Exchange

Objectives

• Transparency => data catalog
• Access => data repository
• Information => data community (future)

Approach

• One record per dataset
• Standardized metadata
• Internal users (10K records): files on file server
• External users (5K records): files for download

Implementation

• CMS: Drupal
• Search: SOLR
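
To make "one record per dataset" with "standardized metadata" concrete, here is a minimal sketch of what a catalog record and a search request might look like. It is not the GHDx schema: every field name, the `country` filter field, and the base URL are hypothetical. The only thing assumed about Solr is its standard `select` handler with the `q`, `fq`, `rows`, and `wt` parameters.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from urllib.parse import urlencode


@dataclass
class DatasetRecord:
    """One catalog entry per dataset (hypothetical fields)."""
    title: str
    country: str
    data_type: str  # e.g. "survey", "census", "vital registration"
    year_start: int
    year_end: int
    public_files: List[str] = field(default_factory=list)

    def is_downloadable(self) -> bool:
        """External users can download only if public files are attached."""
        return bool(self.public_files)


def solr_select_url(base_url: str, text: str,
                    country: Optional[str] = None, rows: int = 10) -> str:
    """Build a Solr select URL: free-text query plus an optional country filter."""
    params = [("q", text), ("rows", str(rows)), ("wt", "json")]
    if country is not None:
        # 'country' is an assumed field name in a hypothetical index.
        params.append(("fq", f'country:"{country}"'))
    return f"{base_url.rstrip('/')}/select?{urlencode(params)}"
```

A record with no `public_files` corresponds to the internal-only case on the slide, where the files live on the file server rather than being offered for download.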

Page 15: Managing and Analyzing Global Health Data


Page 16: Managing and Analyzing Global Health Data

UNIVERSITY OF WASHINGTON

Thank you!

@peterspeyer

www.ghdx.org

Peter Speyer

Director of Data Development

