EpiGrid 2007

Modern Epidemiology – A New Computational Science

Facilitating Epidemiological Research through Computational Tools

Armin R. Mikler

Computational Epidemiology Research Laboratory

Department of Computer Science and Engineering

University of North Texas

Searching for the cause of Death

Epidemiology has been in existence for hundreds of years. Its beginnings were motivated by the question:

What causes people to die?

The answer to this question would lead to new ways to prevent some of the causes, although the inevitable still cannot be avoided.

Medicine ≠ Epidemiology

Dictionaries offer the following definitions:

Medicine -

the science and art dealing with the maintenance of health and the prevention, alleviation, or cure of disease.

Epidemiology -

a branch of medical science that deals with the incidence, distribution, and control of disease in a population.

Notice that "art" is missing from the definition of Epidemiology!

Some Historical events in Epidemiology

Epidemiologic accounts date back to the time of Hippocrates (459 - 377 B.C.) and the ancient Greeks.

In the 1300s, Europe lost 25% of its population of 100 million to the Black Death, or Plague.

A Smallpox outbreak in 1521 killed half of the Aztec Empire's population of 3.5 million people.

London's Cholera Epidemic in 1854.

In 1918, the Spanish Flu (pandemic Influenza) caused over 20 million excess deaths worldwide.

Containing and Controlling the Disease

A better understanding of how diseases spread among the population has led to greater sophistication in methods that prevent or at least contain outbreaks.

Examples:

• Quarantining entire villages.

• Mapping of disease clusters (by John Snow in 1854 during London's Cholera Epidemic).

• Social distancing (quarantining) was advocated during the 1918 Influenza pandemic.

Question: How are we dealing with emerging and re-emerging diseases today?

Epidemiology Computational Epidemiology

Methodical Approaches

John Snow's effort to identify the source of the 1854 Cholera epidemic in London was one of the earliest applications of GIS in Epidemiology.

Morbidity in Context

John Snow’s map of cholera cases led to the identification of the point source of the disease.

The Water Pumps on Broad Street!

The Geographic Information in addition to the Case Data has established a CONTEXT to display Relationships in Time and Space!

A motivating example!

The following data was collected in a single retrospective observational study. What was the cause of death?

Excess deaths by gender (each cell: Exposed (Deaths)):

Social Class    Male           Female       Both
High            180 (118)      145 (4)      325 (122)
Middle          179 (154)      106 (13)     285 (167)
Low             510 (422)      196 (106)    706 (528)
Other           862 (670)      23 (3)       885 (673)
Total           1731 (1364)    470 (126)    2201 (1490)

Excess deaths by age group (each cell: Exposed (Deaths)):

Social Class    Adult          Child        Both
High            319 (122)      6 (0)        325 (122)
Middle          261 (167)      24 (0)       285 (167)
Low             627 (476)      79 (52)      706 (528)
Other           885 (673)      0 (0)        885 (673)
Total           2092 (1438)    109 (52)     2201 (1490)

…more data ….no context
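One way to read such a table is to turn each "Exposed (Deaths)" cell into a death proportion. The sketch below does this for the gender table; the dictionary layout and the function names are illustrative choices and are not part of the original study.

```python
# Exposed and deaths by social class and gender, copied from the gender table.
# Layout: class -> gender -> (exposed, deaths).
DATA = {
    "High":   {"Male": (180, 118), "Female": (145, 4)},
    "Middle": {"Male": (179, 154), "Female": (106, 13)},
    "Low":    {"Male": (510, 422), "Female": (196, 106)},
    "Other":  {"Male": (862, 670), "Female": (23, 3)},
}


def death_rate(exposed, deaths):
    """Proportion of exposed individuals who died (0.0 if nobody was exposed)."""
    return deaths / exposed if exposed else 0.0


def rates_by_stratum(data):
    """Return {(social_class, gender): death proportion} for every cell."""
    return {
        (cls, gender): death_rate(*cell)
        for cls, genders in data.items()
        for gender, cell in genders.items()
    }


def marginal_rate(data, gender):
    """Death proportion for one gender, pooled over all social classes."""
    exposed = sum(genders[gender][0] for genders in data.values())
    deaths = sum(genders[gender][1] for genders in data.values())
    return death_rate(exposed, deaths)


if __name__ == "__main__":
    for (cls, gender), rate in rates_by_stratum(DATA).items():
        print(f"{cls:<8}{gender:<8}{rate:.1%}")
    print(f"All males:   {marginal_rate(DATA, 'Male'):.1%}")
    print(f"All females: {marginal_rate(DATA, 'Female'):.1%}")
```

The proportions make the differences between strata easier to see, but they still say nothing about the cause of death, which is the point of the slide.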

…some Geographical Information…..

…some temporal information….

April 14/15, 1912

A modern Epidemiological study….

Results of a Tuberculosis survey in Tarrant County, TX

•Problem: Insufficient Context

GIS data with greater detail!

The Epicenter

Pictures by Patrick Moonan, 2003

People standing outside homeless shelter

People sleeping inside homeless shelter

After identifying the homeless shelter as the epicenter, the granularity of the study changes again: more contextual detail.

Different arrangements at different prices in the same shelter.

Pictures by Patrick Moonan, 2003

John Snow's approach in the Shelter

The Map

TB Prevalence

From Spatial to Social Epidemiology!

Towards Computational Epidemiology

It's all about the CONTEXT!

How can we prepare for Epidemiologic Emergencies including Epidemics, Pandemics, and Bioterrorism if there is no current morbidity or mortality data available?

We may use historical or anecdotal data!

Questions:

• Can we build a model that reproduces the historic event?

• How would the event manifest itself in a modern context?
  • Demographics may have changed;
  • Infrastructure may have changed;
  • Medical practices may have changed.

Models in Computational Epidemiology

From mathematical models to simulation:

• Basic SIR Model based on Differential Equations
• Dynamic System Modeling
• Data Storage and Analysis
• Simulation
• Data visualization

Computational Epidemiology is more than the sum of its parts: Epidemiology, Computer Science, Mathematics, Dynamic Systems, Public Health, …

• Investigating disease outbreaks and risk assessment in spatially delineated environments
• Investigating intervention strategies to control the spread of diseases
• Investigating spread of disease in demographic and geographic space!

GIS/EPI Data

Model

Visualization

Tools of the trade

o Mathematical Models
  • SIR, SEIR, SIS, etc.
  • Population Ecology
  • Graph Theoretical Models

o Computational solutions:
  • Simulations: Agent based, Cellular Automata, Stochastic Field Simulation
  • Social Networks
  • Spatial-Temporal Databases
  • GIS
  • Visualization
  • Web Interfaces
  • High Performance Computing

Some current issues….

The following is a collection of current problems for which computational models are being developed at CERL:

Contact Models to predict and quantify Pandemic Influenza

Infectious Disease Outbreaks in the K-12 School System

STD Spread Models: HPV & HIV

Social Network Models of Social/Intimate Relationships

Points of Distribution (PODs) Traffic Analysis

PODs Traffic Analysis

Federal and State funding has been used by counties to develop a comprehensive disaster preparedness plan. This plan identifies several sites in the county at which citizens can obtain medication or vaccination in the case of a Bio / Medical disaster.

Questions:

• Can PODs sustain the traffic?

• Can roads sustain the traffic?

• Where should PODs be placed?

• How many people can get service in how little time?
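The last question lends itself to a back-of-the-envelope calculation. This is a minimal sketch, assuming every POD runs a fixed number of identical service stations at a constant rate. The function names and all numbers in the usage note are hypothetical and do not come from any county plan. Road congestion and queueing effects are ignored, and those are what a traffic analysis has to add.

```python
import math


def pod_capacity_per_hour(stations, minutes_per_person):
    """People one POD can serve per hour with `stations` parallel service lines."""
    if stations <= 0 or minutes_per_person <= 0:
        raise ValueError("stations and minutes_per_person must be positive")
    return stations * 60.0 / minutes_per_person


def hours_to_serve(population, num_pods, stations, minutes_per_person):
    """Hours needed to serve `population` if load is spread evenly over all PODs."""
    if num_pods <= 0:
        raise ValueError("num_pods must be positive")
    total_rate = num_pods * pod_capacity_per_hour(stations, minutes_per_person)
    return population / total_rate


def pods_needed(population, deadline_hours, stations, minutes_per_person):
    """Smallest number of PODs that serves `population` within `deadline_hours`."""
    per_pod = pod_capacity_per_hour(stations, minutes_per_person) * deadline_hours
    return math.ceil(population / per_pod)
```

With hypothetical values of 20 stations per POD and 2 minutes per person, `pods_needed(110_000, 48, 20, 2.0)` returns 4, and five such PODs would need about 36.7 hours to serve 110,000 people.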

PODs

Waiting on line to get smallpox vaccine during the New York City smallpox epidemic (1947)

We need to experiment with intervention strategies

When, how, whom, and where should we vaccinate?

What are the predicted outcomes of specific strategies?

How should mass-intervention be organized?

The Global Outbreak Model

Vaccination

Population

Demographics

Disease Parameters

Data Sets Visualization

Interaction factors

Distances

Disease Parameters

Illustrates time-line for infection (influenza)

• Latent period
• Infectious period
• Incubation period
• Infectivity
• Index case
• Multiple index cases
• Location of index case

Model Parameters

o Population per cell
o Demographics, i.e. Age Distribution, Ethnic Distribution, Gender Distribution, etc.
o Geography/Hotspots
o Contact Rate(s): avg., when symptomatic
o Vaccinated Population
o Vaccine Efficacy
o Natural Immunity
o Immune Deficient Population
o Public Health Events

Population distribution over the North Denton region.

Total Population of 110,000 distributed over a grid size of 50 * 100.

The Complexity of Contacts

Infectious

Infection transmits?

Function of the infectivity parameter

Common meeting area

Contact includes

•Exposure

•Duration of exposure

•Infectivity / Virulence of the virus (infection)

•Immunity

•Age of individual

•other demographic characteristics

Contact is any interaction that facilitates successful disease transmission.

Epidemics are driven by Contacts and Exposures!
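A minimal sketch of how these factors might be combined into a single transmission probability. The multiplicative form, the function names, and the parameter ranges are assumptions made for illustration. The slides do not specify the functional form used in the model.

```python
def transmission_probability(infectivity, immunity=0.0, exposure_weight=1.0):
    """Probability that one contact with an infectious person transmits the disease.

    infectivity     : baseline per-contact transmission probability, in [0, 1]
    immunity        : protection of the susceptible person, in [0, 1] (1 = fully immune)
    exposure_weight : scaling for duration/closeness of exposure, >= 0 (assumed)
    """
    if not 0.0 <= infectivity <= 1.0 or not 0.0 <= immunity <= 1.0:
        raise ValueError("infectivity and immunity must lie in [0, 1]")
    if exposure_weight < 0.0:
        raise ValueError("exposure_weight must be non-negative")
    p = infectivity * exposure_weight * (1.0 - immunity)
    return min(p, 1.0)  # clamp, since a large exposure_weight could exceed 1


def infection_probability(per_contact_p, infectious_contacts):
    """Probability of at least one successful transmission over independent contacts."""
    return 1.0 - (1.0 - per_contact_p) ** infectious_contacts
```

For example, with an infectivity of 0.02 and no immunity, five independent contacts with infectious people give an infection probability of 1 - 0.98^5, or roughly 9.6%.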

Social Networks determine contacts

o Clusters indicate strongly connected subgroups.

o Measures of Affinity

o Who is likely to contact whom?

Population Distributions for Different Age Groups

[Figure: population distribution maps for four age groups: 0-9 years, 10-34 years, 35-59 years, 60+ years]

Global Stochastic Field Model demographic layers

[Figure: demographic layers for age groups 0-9, 10-34, 35-59, and 60+, showing an individual and the probable area of interaction. Labels: probability of interactions based on distances; probability of interactions among various age groups.]
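The idea of interaction probabilities that fall off with distance can be sketched as follows. The exponential decay, the `scale` parameter, and the function names are assumptions for illustration. The slides do not state which distance function the Global Stochastic Field model uses.

```python
import math


def interaction_weights(source, cells, populations, scale=2.0):
    """Unnormalized weight for `source` interacting with each cell.

    source      : (row, col) of the cell the individual lives in
    cells       : list of (row, col) grid coordinates
    populations : population of each cell, same order as `cells`
    scale       : distance (in cell widths) over which weight drops by 1/e (assumed)
    """
    weights = []
    for (r, c), pop in zip(cells, populations):
        distance = math.hypot(r - source[0], c - source[1])
        weights.append(pop * math.exp(-distance / scale))
    return weights


def interaction_probabilities(source, cells, populations, scale=2.0):
    """Normalize the weights so they sum to 1 over all cells."""
    weights = interaction_weights(source, cells, populations, scale)
    total = sum(weights)
    if total == 0:
        raise ValueError("no population to interact with")
    return [w / total for w in weights]
```

The resulting list can be passed as `weights` to `random.choices` to draw the cells in which an individual makes contacts. A separate kernel per age group would give one layer per demographic.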

Visualizing Spatial Spread of Influenza simulated over Northern Denton County

Local Interaction

Index case

Global Interaction

A composite SIR Model

100 small regional outbreaks

Σ

Observed Epidemic

Regional outbreaks

We can extend the model from geographic regions to demographic sub-populations
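The summation of regional outbreaks into one observed epidemic can be sketched in a few lines. This is a minimal illustration, assuming each regional outbreak follows a simple discrete-time SIR update and starts on a different day. Transmission is scaled by the regional population here, so that `beta` is a per-day rate; that is a modeling choice of this sketch. The parameter values and function names are made up for the example.

```python
def regional_outbreak(population, beta, gamma, start_day, days):
    """Daily number of infectives for one region (discrete-time SIR, step = 1 day).

    The region stays disease-free until `start_day`, when one case is seeded.
    beta is the daily transmission rate, gamma the daily removal rate (<= 1).
    """
    s, i, r = float(population), 0.0, 0.0
    infectives = []
    for day in range(days):
        if day == start_day:
            s, i = s - 1.0, i + 1.0  # seed the index case
        new_inf = min(s, beta * s * i / population)
        new_rem = gamma * i
        s, i, r = s - new_inf, i + new_inf - new_rem, r + new_rem
        infectives.append(i)
    return infectives


def composite_epidemic(regions, days):
    """Sum the regional curves day by day to get the observed epidemic."""
    curves = [regional_outbreak(days=days, **region) for region in regions]
    return [sum(day_values) for day_values in zip(*curves)]
```

A hypothetical run with 100 regions could use `regions = [dict(population=1000, beta=0.5, gamma=0.2, start_day=3 * k) for k in range(100)]` and `composite_epidemic(regions, days=400)`. Replacing the regions with demographic sub-populations changes only what each entry in the list represents.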

Composition Model

Assumption: Sub-regions (or cells) with a larger proportion of a certain demographic may display increased or decreased prevalence of a certain disease as compared to a sub-region with a larger proportion of a different demographic.

Cell interaction is controlled by age proportions and population densities.

Composition model reflects the spread of the infection in each sub-region.

Observed Cumulative Epidemic caused by Temporally and Spatially Distributed Local Outbreaks

Mathematical Modeling of Epidemics

Let β be the transmission rate based on contact rate and infectivity.

Let γ be the rate of infectives becoming non-infectious.

Susceptible (S) → Infectives (I) → Removed (R)

$$\frac{dS}{dt} = -\beta S I$$

$$\frac{dI}{dt} = \beta S I - \gamma I$$

$$\frac{dR}{dt} = \gamma I$$

• Susceptibles Infectives Removals (SIR) model
• Susceptibles Exposed Infectives Removals (SEIR)
• Susceptibles Infectives Susceptibles (SIS)
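A short worked step, using only the SIR equations with β and γ as defined, shows when an outbreak can grow. The symbols $S_0$ (initial number of susceptibles) and $R_0$ are introduced here for illustration and are not defined on the slides.

```latex
% Adding the three equations shows that the total population is conserved:
\frac{d}{dt}\,(S + I + R) = -\beta S I + (\beta S I - \gamma I) + \gamma I = 0

% Factoring the equation for the infectives:
\frac{dI}{dt} = (\beta S - \gamma)\, I

% The number of infectives therefore increases only while
\beta S > \gamma \quad\Longleftrightarrow\quad S > \frac{\gamma}{\beta}

% At the start of an outbreak, with S \approx S_0, this is the threshold condition
R_0 = \frac{\beta S_0}{\gamma} > 1
```

Once enough susceptibles have been removed that $S$ falls below $\gamma/\beta$, the number of infectives declines, even though some susceptibles remain.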

Susceptibles Infectives Removals (SIR) model

The naïve SIR assumes:

o Homogeneous mixing of people

o Every individual makes same contacts

o No demographics considered

o Geographical distances not considered

Things can get unwieldy when adding demographics!!
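To see why, consider one common way to add demographics. This is an assumption for illustration and not necessarily the formulation used at CERL. Split the population into $n$ groups, such as age groups, each with its own compartments $S_i$, $I_i$, $R_i$, a removal rate $\gamma_i$, and a transmission rate $\beta_{ij}$ from infectives in group $j$ to susceptibles in group $i$.

```latex
\frac{dS_i}{dt} = -S_i \sum_{j=1}^{n} \beta_{ij} I_j

\frac{dI_i}{dt} = S_i \sum_{j=1}^{n} \beta_{ij} I_j - \gamma_i I_i

\frac{dR_i}{dt} = \gamma_i I_i
\qquad i = 1, \dots, n
```

This gives $3n$ coupled equations and $n^2$ transmission rates. With four age groups that is already 12 equations and 16 values of $\beta_{ij}$ to estimate, before any geography is added.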

Composition Model - Experiment

Simulation parameters:

Disease Simulated : Influenza like disease

Incubation period : 3 days

Infectious period: 3 days

Recovery period: 5 days

Infectivity : 0.020

Contact rate/person : 11

o The population distribution over the region is non-uniform.

o Contacts made between cells depend on the population of the cell.

o Assumption: Regions with high population make more contacts than regions with low population.
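The assumption can be sketched as follows: each cell generates contacts in proportion to its population, and the targets of those contacts are drawn with probability proportional to the population of the receiving cell. The function names and the proportional sampling rule are illustrative assumptions. The default contact rate of 11 per person is taken from the simulation parameters listed for this experiment.

```python
import random


def contacts_per_cell(populations, contact_rate=11):
    """Total daily contacts generated by each cell: population x contact rate."""
    return [pop * contact_rate for pop in populations]


def sample_contact_targets(populations, n_contacts, rng=None):
    """Draw target cell indices with probability proportional to cell population."""
    rng = rng or random.Random()
    if sum(populations) <= 0:
        raise ValueError("at least one cell must be populated")
    cells = list(range(len(populations)))
    return rng.choices(cells, weights=populations, k=n_contacts)
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which helps when comparing intervention scenarios.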

Composition Model - Experiment

Population distribution over the north Denton region.

Total Population of 110,000 distributed over a grid size of 50 * 100.

Total Population infected at the end of simulation: 48,000

Infected Population distribution over the north Denton region.

Experiment - Immunity

o The probability of a contact with an infectious person resulting in a successful disease transmission depends on the immunity of the individual.

o The experiment assumed that people residing in a particular region were immune to the particular virus, by means of either vaccination or previous infection.

o The results show a lower prevalence of disease in that region compared to other regions.
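The size of the susceptible pool in an immunized region can be estimated with a short calculation. This is a minimal sketch, assuming vaccination and natural immunity are independent and that a vaccine of efficacy e fully protects a fraction e of its recipients. The function name and parameters are hypothetical, and the slides do not describe how immunity was implemented in the experiment.

```python
def susceptible_count(population, vaccinated_fraction, vaccine_efficacy,
                      natural_immunity_fraction=0.0):
    """Expected number of people in a region who can still be infected.

    Assumes vaccination and natural immunity are independent of each other
    and that a vaccine of efficacy e fully protects a fraction e of recipients.
    """
    for name, value in (("vaccinated_fraction", vaccinated_fraction),
                        ("vaccine_efficacy", vaccine_efficacy),
                        ("natural_immunity_fraction", natural_immunity_fraction)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    unprotected_by_vaccine = 1.0 - vaccinated_fraction * vaccine_efficacy
    not_naturally_immune = 1.0 - natural_immunity_fraction
    return population * unprotected_by_vaccine * not_naturally_immune
```

With hypothetical values of 50% coverage and 80% efficacy, a region of 1,000 people keeps about 600 susceptibles. A fully immunized region has none and can only slow the spread to its neighbors.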

Region Immunized

The need for Computational Horse Power

Large geographic region → Many cells/objects

Many cells/objects → Complex interactions

Many cells/objects → Simulation of Multiple Populations

Complex interaction and Multiple Populations → Large Computational Complexity

We need multiple computers to execute small pieces of the simulation simultaneously.
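Splitting a grid of cells into pieces that can be processed independently might look like the sketch below. It is an illustration only: the strip decomposition, the function names, and the placeholder per-strip task (counting infected cells marked "I") are assumptions, not the method used in the simulations shown here.

```python
def split_rows(n_rows, n_workers):
    """Split row indices 0..n_rows-1 into contiguous, nearly equal strips."""
    if n_workers <= 0:
        raise ValueError("n_workers must be positive")
    base, extra = divmod(n_rows, n_workers)
    strips, start = [], 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)
        if size:
            strips.append(range(start, start + size))
        start += size
    return strips


def count_infected_in_strip(task):
    """Placeholder per-strip work: count infected cells in the given rows."""
    grid, rows = task
    return sum(1 for r in rows for cell in grid[r] if cell == "I")


def count_infected(grid, n_workers, map_fn=map):
    """Process every strip with `map_fn` and combine the partial results."""
    tasks = [(grid, rows) for rows in split_rows(len(grid), n_workers)]
    return sum(map_fn(count_infected_in_strip, tasks))
```

With the default `map_fn=map` the strips are processed one after another. Passing the `map` method of a `concurrent.futures.ProcessPoolExecutor` (created inside an `if __name__ == "__main__":` guard) would spread them over the cores of one machine. A real simulation would also have to exchange boundary information between strips, because cells interact with their neighbors.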

The Future: Clusters and the GRID

o Faster hardware and new high-bandwidth networks demand that we explore new cluster architectures.

o Larger, more complex cluster environments make it imperative to invest in new efficient and scalable tools.

o Grand Challenge problems will continue to drive the development of computing infrastructure.

o Distributed HPC will become commonplace. (DOE SciDAC)

o Management Tools designed for single hosts or small clusters are likely NOT to scale.

o New types of Middleware are needed to decouple the underlying distributed infrastructure from the applications.

Grid Layers…virtualization

[Figure: grid layer diagram. Labels: Internet / Private Networks; GridEngine (six instances); General Grid Services; Application-Specific Grid Services (APIs); Applications; Middleware; Grid Access. Example grids: Data Grid, Comp. Grid, BioGrid, EPIGrid?]

i.e., Scientific Discovery through Advanced Computing

Validate, validate, validate…

“Great Model”, “Compelling Results”, “…nice tool” … BUT … HOW DID YOU VALIDATE ITS CORRECTNESS?

Problems:
• No Data on Emerging Infectious Diseases to compare against
• Insufficient Domain Knowledge
• HIPAA & Data Privacy
• Incomplete or Missing Data
• Complexity, Complexity, Complexity

Much Ado About Nothing!?

Domain Knowledge and Expectation

Don’t kid yourself: we do not understand the details of how society works and how people interact!

We can only theorize how an epidemic might manifest itself and prepare for the worst-case scenario!

If data from previous epidemics (of the same or a similar disease) is available, expectations can be based on observation. HOWEVER, circumstances have most likely changed.

Idea: Develop computational tools that allow experts to express their expectations. Validate against DOMAIN EXPERTISE even if it is just a theory or hunch!

Conclusion

There are many different methodologies to choose from: Mathematical Models, Agent Based Models, CAs, GSFS, etc.

Choose the most appropriate modeling/simulation technique based on the domain characteristics: Spatially Delineated, Regional, …

When developing a computational tool, keep in mind whose work is going to be facilitated!! Visualize & Parameterize & Animate

Facilitate WHAT-IF-ANALYSES and support quantification of policy and/or strategic decision making.

Validate against domain expertise if no reliable data source is available!
