Visual Analytics of Infectious Diseases
Daniel Janies and John Williams (1)
Department of Bioinformatics and Genomics, University of North Carolina at Charlotte
(1) Ribarsky Center for Visualization Analytics
Contact info: [email protected]
Acknowledgement: This work was supported by the Defense Threat Reduction Agency Contract HDTRA1-16-C-0010 shared with Umit Catalurek’s group at
Georgia Tech.
3/26/2018 2
Outline
• Applications and use cases:
• Antimicrobial Resistance Dashboard
• Pathogen Dynamic Graph
• Virtual Globes and Phylogenetic Trees
AMR Data Sources
• Pathogenic and antimicrobial resistant (AMR) bacterial strains are being sequenced by GenomeTrakr and collaborators and submitted to the Pathogen Detection DB at NCBI.
• Although these genetic data are not fully annotated, they are swept for genes that confer antibiotic resistance.
• Rich metadata is provided as well. For example: strain, collection date, host, isolation source, and locality.
Live Demo of the Antimicrobial Resistance Dashboard
Surveillance of Antimicrobial Resistance Dashboard
• We analyze these data in an interactive web app created for the Biosurveillance Ecosystem at DTRA.
The user selects a pathogen and one or more genotypes.
The application then displays a choropleth map with the percentage of samples containing the gene of interest over a specific time period.
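The percentage behind each map region can be sketched as follows. This is a minimal illustration, not the dashboard's actual code: the record fields (`locality`, `collection_date`, `amr_genes`), the function name, and the sample records are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date

def percent_with_gene(samples, gene, start, end):
    """Return {locality: percent of samples carrying `gene`} for samples
    collected between `start` and `end` (inclusive)."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for sample in samples:
        if start <= sample["collection_date"] <= end:
            totals[sample["locality"]] += 1
            if gene in sample["amr_genes"]:
                hits[sample["locality"]] += 1
    return {loc: 100.0 * hits[loc] / totals[loc] for loc in totals}

# Hypothetical records; real ones would come from the Pathogen Detection metadata.
samples = [
    {"locality": "Region A", "collection_date": date(2015, 3, 1), "amr_genes": {"mcr-1", "other-gene"}},
    {"locality": "Region A", "collection_date": date(2016, 7, 9), "amr_genes": {"other-gene"}},
    {"locality": "Region B", "collection_date": date(2016, 1, 20), "amr_genes": {"mcr-1"}},
    {"locality": "Region B", "collection_date": date(2009, 5, 5), "amr_genes": set()},
]

result = percent_with_gene(samples, "mcr-1", date(2010, 1, 1), date(2017, 12, 31))
print(result)
```

Each value in `result` would set the color of one region on the choropleth; the 2009 sample falls outside the selected period and is ignored.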
A box plot chart allows the user to observe the distribution and variation of data between subgroups for the selected time period.
A line chart allows the user to observe trends in the percentage of resistant samples by geographic region compared to the worldwide average.
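The numbers behind these two charts can be sketched as below. This is a hedged illustration only: the tuple layout `(region, year, is_resistant)`, the function names, and the data are assumptions, and the quartile method is one of several a plotting library might use.

```python
from collections import defaultdict
from statistics import quantiles

def five_number_summary(values):
    """Minimum, quartiles, and maximum: the quantities a box plot draws."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

def yearly_percent(samples, region=None):
    """Percent of resistant samples per year for one region,
    or worldwide when `region` is None."""
    totals = defaultdict(int)
    resistant = defaultdict(int)
    for sample_region, year, is_resistant in samples:
        if region is None or sample_region == region:
            totals[year] += 1
            resistant[year] += int(is_resistant)
    return {year: 100.0 * resistant[year] / totals[year] for year in sorted(totals)}

# Hypothetical samples: (region, collection year, carries the gene of interest)
samples = [
    ("Americas", 2015, True),
    ("Americas", 2015, False),
    ("Americas", 2016, True),
    ("Asia", 2015, True),
    ("Asia", 2016, False),
    ("Asia", 2016, False),
]

americas = yearly_percent(samples, region="Americas")  # one line on the chart
worldwide = yearly_percent(samples)                    # the comparison line
summary = five_number_summary([10, 20, 30, 40, 50])    # one box in the box plot
```

Plotting `americas` against `worldwide` year by year gives the regional trend compared to the worldwide average.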
Live demo of Histograms
Multi-drug resistance (MDR) can be observed via frequency distribution charts for a pathogen, over time spans and geographic ranges of interest.
The frequency distribution of unique AMR genes in Enterobacter samples isolated in the Americas from 2010 to 2017.
The frequency distribution of unique AMR genes in Enterobacter samples isolated in the Americas between 2000 and 2009.
AMR genes in E. coli isolated in the Americas between 1979 and 2017.
The user can also see the data for the pathogens isolated for each bar in the chart.
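One way to read these charts is that each bar counts the samples carrying a given number of unique AMR genes. Under that assumption, the histogram and the per-bar drill-down can be sketched as follows; the sample identifiers and gene labels are hypothetical.

```python
from collections import defaultdict

def mdr_histogram(samples):
    """Map 'number of unique AMR genes' -> list of sample ids.
    Each key is one bar; the list is the drill-down for that bar."""
    bars = defaultdict(list)
    for sample_id, genes in samples.items():
        bars[len(set(genes))].append(sample_id)
    return dict(sorted(bars.items()))

# Hypothetical samples and gene labels (duplicates are counted once).
samples = {
    "S1": ["g1", "g2", "g2"],
    "S2": [],
    "S3": ["g1", "g3"],
    "S4": ["g1", "g2", "g3", "g4"],
}

bars = mdr_histogram(samples)
counts = {n_genes: len(ids) for n_genes, ids in bars.items()}  # bar heights
print(counts)
print(bars[2])  # samples behind the "2 genes" bar
```

Keeping the sample ids per bar, rather than only the counts, is what lets the analyst move from the overview to individual isolates.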
• Monitoring the frequency distribution of genes within samples is critical to understanding the spread of multidrug-resistant bacteria.
• We have brought these raw data into an easy-to-use MDR surveillance application that allows the analyst to make comparisons over large swaths of data and to drill down to individual samples.
Antibiograms
• Tables of phenotype data on the antibiotic resistance of pathogens, traditionally used in infection control.
• Data is often aggregated over time, patient population, and sample type.
• Thus antibiograms do not readily expose trends or leverage metadata.
https://www.safetyandquality.gov.au/wp-content/uploads/2016/03/InfoSheet3-WhatisanAntibiogram.pdf
For example, suppose the antibiogram indicates that only 12.7% of E. coli strains are susceptible to Ampicillin (am).
An alternative antibiotic is Ceftiofur (cf) because ~93% of E. coli strains are susceptible to it.
However, metadata such as travel history can prompt further investigation of the antibiotic selection.
Top of list: antibiotics to which E. coli tends to be resistant
Ampicillin (am) 12.7 %S
Ceftriaxone (cr) 28.1 %S
Ciprofloxacin (ci) 40.2 %S
Tetracycline (te) 42.0 %S
Lower end of list: antibiotics to which E. coli tends to be sensitive
Streptomycin (st) 84.0 %S
Meropenem (m) 85.5 %S
Imipenem (i) 86.7 %S
Ceftiofur (cf) 92.91 %S
Distribution Statement A. Approved for public release; distribution is unlimited.
Ceftiofur-resistant E. coli has been found in Shenzhen, China. Using a tool we created called the Pathogen Dynamic Graph, the analyst can leverage the location metadata of Ceftiofur-resistant E. coli: for a patient whose recent travel history includes Shenzhen, China, a different antibiotic would be the better choice.
Figure: Pathogen Dynamic Graph view of Ceftiofur (cf) and E. coli. Legend: Organism, Resistance Profile, Location.
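The reasoning on this slide, combining antibiogram percentages with location metadata, can be sketched as follows. The %S values and the Shenzhen report come from the slides; the function name, the data layout, and the simple rule of excluding flagged drugs are assumptions made for illustration, not the PDG's logic.

```python
# Percent of E. coli strains susceptible (%S), as listed in the antibiogram.
ANTIBIOGRAM = {
    "Ampicillin": 12.7,
    "Ceftriaxone": 28.1,
    "Ciprofloxacin": 40.2,
    "Tetracycline": 42.0,
    "Streptomycin": 84.0,
    "Meropenem": 85.5,
    "Imipenem": 86.7,
    "Ceftiofur": 92.91,
}

# (drug, location) pairs where resistant E. coli has been reported.
RESISTANCE_REPORTS = {("Ceftiofur", "Shenzhen, China")}

def rank_antibiotics(antibiogram, reports, travel_history):
    """Rank drugs by %S, dropping any drug with reported resistance
    in a location the patient recently visited."""
    flagged = {drug for drug, place in reports if place in travel_history}
    candidates = [(drug, pct) for drug, pct in antibiogram.items() if drug not in flagged]
    return sorted(candidates, key=lambda item: item[1], reverse=True)

no_travel = rank_antibiotics(ANTIBIOGRAM, RESISTANCE_REPORTS, set())
with_travel = rank_antibiotics(ANTIBIOGRAM, RESISTANCE_REPORTS, {"Shenzhen, China"})
print(no_travel[0])    # top candidate from the antibiogram alone
print(with_travel[0])  # top candidate once travel history is considered
```

The aggregated antibiogram alone ranks Ceftiofur first; adding the location metadata changes the ranking for a traveler returning from Shenzhen.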
Pathogen Dynamic Graph (PDG)
• There are mountains of genomic data, phenotype data, metadata, and papers on pathogens collected across the world.
• Using the PDG, a user gets a broad overview of the diversity and relationships of metadata and genes.
• By navigating the PDG the user can readily understand how bacteria and viruses spread over time, space, and various hosts.
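A graph of this kind can be sketched by treating every metadata value as a typed node and linking the values that co-occur in one record. This is an assumed, simplified data model for illustration; the slides do not describe how the PDG is implemented, and all labels below are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def build_graph(records):
    """Undirected graph whose nodes are (type, value) pairs;
    values that appear in the same record are linked pairwise."""
    graph = defaultdict(set)
    for record in records:
        nodes = list(record.items())
        for a, b in combinations(nodes, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def neighbors_of_type(graph, node, node_type):
    """Values of one node type directly connected to `node`."""
    return sorted(value for kind, value in graph[node] if kind == node_type)

# Hypothetical isolate records.
records = [
    {"Organism": "Organism X", "Gene": "gene-1", "Host": "Host A", "Location": "Location 1"},
    {"Organism": "Organism X", "Gene": "gene-1", "Host": "Host B", "Location": "Location 2"},
    {"Organism": "Organism Y", "Gene": "gene-1", "Host": "Host B", "Location": "Location 2"},
]

graph = build_graph(records)
print(neighbors_of_type(graph, ("Gene", "gene-1"), "Organism"))
print(neighbors_of_type(graph, ("Organism", "Organism X"), "Location"))
```

Navigating from a gene to its organisms, and from an organism to its hosts and locations, is the kind of traversal the bullets above describe.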
Live Demo of the Pathogen Dynamic Graph
Drug of last resort (e.g. Colistin)
• Tried after all other drug options have failed to produce an adequate response in the patient.
• Used outside of extant regulatory requirements or medical best practices.
• Colistin is a decades-old drug that has kidney toxicity. It remains one of the last-resort antibiotics for multidrug-resistant bacteria.
The PDG allows the analyst to map the traffic of plasmid-mediated transfer of mcr-1, which confers resistance to colistin, across various bacteria, infecting various hosts, in various locations.
Legend: Disease, Organism, Host, Protein, Gene, Location
The PDG allows the user to understand the different means by which colistin resistance spreads (e.g., E. coli in agricultural animals and humans vs. Klebsiella in humans).
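The comparison in this example can be sketched as a summary of hosts and locations per organism for isolates carrying mcr-1. The isolate records, field names, and location labels below are hypothetical; only the organisms, host categories, and gene name come from the slides.

```python
from collections import defaultdict

def spread_by_organism(isolates, gene):
    """For isolates carrying `gene`, collect the hosts and locations
    seen for each organism."""
    summary = defaultdict(lambda: {"hosts": set(), "locations": set()})
    for isolate in isolates:
        if gene in isolate["genes"]:
            entry = summary[isolate["organism"]]
            entry["hosts"].add(isolate["host"])
            entry["locations"].add(isolate["location"])
    return dict(summary)

# Hypothetical isolates, shaped to mirror the example on the slide.
isolates = [
    {"organism": "E. coli", "host": "agricultural animal", "location": "Location 1", "genes": {"mcr-1"}},
    {"organism": "E. coli", "host": "human", "location": "Location 2", "genes": {"mcr-1"}},
    {"organism": "Klebsiella", "host": "human", "location": "Location 2", "genes": {"mcr-1"}},
    {"organism": "Klebsiella", "host": "agricultural animal", "location": "Location 1", "genes": set()},
]

summary = spread_by_organism(isolates, "mcr-1")
for organism, entry in summary.items():
    print(organism, sorted(entry["hosts"]), sorted(entry["locations"]))
```

Differences between the host sets of two organisms are what point the analyst toward different routes of spread.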
Live demo
Radar plots to organize antimicrobial resistance phenotype data within antibiotic classes, served in the context of the Pathogen Dynamic Graph
Antibiotic resistance profiles for Escherichia coli
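One way to organize such a radar plot is to place antibiotics of the same class on adjacent spokes. The sketch below computes the spoke order and angles, reusing the E. coli %S values from the antibiogram slide; the class assignments follow standard pharmacology but are an addition here, and the layout is an assumption rather than the PDG's actual design.

```python
# Percent susceptible (%S) for E. coli, from the antibiogram slide.
ANTIBIOGRAM = {
    "Ampicillin": 12.7,
    "Ceftriaxone": 28.1,
    "Ciprofloxacin": 40.2,
    "Tetracycline": 42.0,
    "Streptomycin": 84.0,
    "Meropenem": 85.5,
    "Imipenem": 86.7,
    "Ceftiofur": 92.91,
}

# Antibiotic classes (assumed grouping for this example).
DRUG_CLASS = {
    "Ampicillin": "penicillins",
    "Ceftriaxone": "cephalosporins",
    "Ceftiofur": "cephalosporins",
    "Ciprofloxacin": "fluoroquinolones",
    "Tetracycline": "tetracyclines",
    "Streptomycin": "aminoglycosides",
    "Meropenem": "carbapenems",
    "Imipenem": "carbapenems",
}

def radar_spokes(antibiogram, drug_class):
    """Return (class, drug, %S, angle in degrees) per spoke,
    ordered so that drugs of the same class sit next to each other."""
    ordered = sorted(antibiogram, key=lambda drug: (drug_class[drug], drug))
    step = 360.0 / len(ordered)
    return [(drug_class[d], d, antibiogram[d], i * step) for i, d in enumerate(ordered)]

spokes = radar_spokes(ANTIBIOGRAM, DRUG_CLASS)
for spoke in spokes:
    print(spoke)
```

The radius of each spoke is the %S value, so a class in which resistance is common shows up as a dent spanning neighboring spokes.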
Virtual Globes and Phylogenetic Trees
Map labels: cases A, B, C; cities Chicago, Toronto, Washington, DC
Three cases of a novel infectious disease:
If we sequence the genomes of the pathogens, the three outbreaks can be interconnected and understood via their connections to background data.
Where did the pathogen originate?
From Asia? From Europe?
From South America and the Caribbean?
Or did the pathogen originate in North America, and will it spread abroad?
To Asia? To Europe?
To South America and the Caribbean?
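One simple way to pose the origin question on a phylogenetic tree is Fitch parsimony over the locations of the sampled genomes. The sketch below is an illustration of that general technique, not necessarily the method used in this work; the tree shape, tip names, and background isolates are all hypothetical.

```python
def fitch_root_states(tree, tip_state):
    """Bottom-up Fitch pass on a binary tree given as nested 2-tuples of tip names.
    Returns the set of most-parsimonious candidate locations at the root."""
    if isinstance(tree, str):
        return {tip_state[tree]}
    left, right = (fitch_root_states(child, tip_state) for child in tree)
    common = left & right
    return common if common else left | right

# Hypothetical tree: the three new cases (A, B, C) nested among background isolates.
tree = (("bg_asia1", "bg_asia2"), ("bg_asia3", (("A", "B"), "C")))

# Hypothetical sampling locations for every tip.
tip_state = {
    "A": "North America",
    "B": "North America",
    "C": "North America",
    "bg_asia1": "Asia",
    "bg_asia2": "Asia",
    "bg_asia3": "Asia",
}

origin = fitch_root_states(tree, tip_state)
print(origin)
```

With this toy tree, the three cases form one clade nested within Asian background isolates, so the reconstruction favors an introduction from Asia; a different tree or background sample could point elsewhere.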
Or in the case of Zika virus, from South America and the Caribbean via the Pacific
The vectors for Zika virus
• Aedes aegypti
• Aedes albopictus
• Other Aedes and perhaps other mosquitos in Africa
• Homo sapiens
  • blood transfusion
  • sexual contact
  • saliva, tears
https://en.wikipedia.org/wiki/Aedes_aegypti#/media/File:Aedes_aegypti.jpg
History of Pathogenicity of Zika Virus
• After-Yap (2007-today)
  • Pre-Yap symptoms
  • Microcephaly / Zika Fetal Syndrome
  • Guillain-Barré Syndrome
• Pre-Yap (1967 – 2007)
  • Fever
  • Skin rashes
  • Body pain
https://www.cdc.gov/ncbddd/birthdefects/images/microcephaly-comparison-500px.jpg
Zika traveled to Asia from Africa and began to change the spectrum of diseases it causes
White lineages: benign (fever, rash)
Yellow lineages: severe (microcephaly and other birth defects, Guillain-Barré syndrome)
Phylogeny of Zika Virus
Molecular evolution of Zika virus as it crossed the Pacific to the Americas. Adriano de Bernardi Schneider, Robert W. Malone, Jun-Tao Guo, Jane Homan, Gregorio Linchangco, Zachary L. Witter, Dylan Vinesett, Lambodhar Damodaran, Daniel A. Janies. First published: 12 December 2016, https://doi.org/10.1111/cla.12178
Zika’s Journey
Figure: two maps of Africa, Asia, the Pacific, and the Americas, labeled “Out-of-Africa hypothesis” and “Africa/Asia hypothesis”
Molecular Evolution of the Zika Virus
Outbreak: 2007
Outbreak: 2012
Outbreak: 2015-16
What happened here?
Substitutions in the 3’UTR (position – change), in two columns:
9bp – U > C      258bp – U > C
42bp – U > C     266bp – A > G
66bp – A > G     275bp – C > A
97bp – U > C     394bp – C > U
98bp – C > U     425bp – U > G
192bp – A > G    427bp – U > C
257bp – C > U    428bp – C > U
African lineages
Asia-Pacific-Americas clade
Pre-MBE
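A list of substitutions like the one above can be derived from two aligned sequences. The sketch below shows the idea on short toy sequences; the sequences are hypothetical, and treating the first sequence as the reference (ancestral) state is an assumption about the direction of change.

```python
def list_substitutions(reference, derived):
    """Compare two aligned RNA sequences and report substitutions
    as '<1-based position>bp: <reference base> > <derived base>'.
    Alignment gaps ('-') are skipped."""
    if len(reference) != len(derived):
        raise ValueError("sequences must be aligned to the same length")
    changes = []
    for position, (ref_base, new_base) in enumerate(zip(reference, derived), start=1):
        if ref_base != new_base and "-" not in (ref_base, new_base):
            changes.append(f"{position}bp: {ref_base} > {new_base}")
    return changes

# Toy aligned sequences (not real Zika 3'UTR data).
reference = "AUGCUUAC"
derived = "ACGCUCAC"

changes = list_substitutions(reference, derived)
print(changes)
```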
Zika 3’UTR Alignment: African and Asia-Pacific-Americas Clades
Zika Fetal Neuropathogenesis: Etiology of a Viral Syndrome. Zachary A. Klase, Svetlana Khakhina, Adriano De Bernardi Schneider, Michael V. Callahan, Jill Glasspool-Malone, Robert Malone. Published: August 25, 2016, https://doi.org/10.1371/journal.pntd.0004877
3’UTRs of different Lineages of Zika Virus
Panels: African lineage (MBE marked) and Asia/Pacific/Americas lineage (MBE marked)
Musashi Binding Elements (MBE)
Musashi is a family of RNA-binding proteins that regulate multiple stem cell populations.
2015 - Zika targets cerebral neural precursors; cause unknown
Early 2016 - MBEs are first described in the literature in association with Zika
Late 2016 - MBEs are studied in an evolutionary context, with predicted secondary structure and enhanced binding energies in the Asia-Pacific-Americas lineage of Zika virus
2017 - Musashi-1 interacts with the Zika genome and enables viral replication
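Candidate MBEs can be located in an RNA sequence with a simple motif scan. The sketch below assumes the commonly cited Musashi-1 consensus (G/A)U(1-3)AGU; the slides do not state which motif definition was used, and the sequence is a toy example rather than Zika data.

```python
import re

# Assumed consensus: (G/A), one to three U, then AGU.
# The lookahead lets overlapping candidates be reported as well.
MBE_PATTERN = re.compile(r"(?=([GA]U{1,3}AGU))")

def find_mbe_candidates(rna):
    """Return (0-based start, matched motif) for each candidate MBE."""
    return [(match.start(), match.group(1)) for match in MBE_PATTERN.finditer(rna)]

toy_utr = "CCGUAGUCCAUUUAGUGG"  # hypothetical sequence
candidates = find_mbe_candidates(toy_utr)
print(candidates)
```

Running the same scan on 3’UTRs from different lineages would show which candidate sites are gained or lost; a sequence match alone does not establish binding.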
Americas March 2016
Americas August 2016
http://www.cdc.gov/zika/intheus/maps-zika-us.html
Local Zika Virus transmission in Florida (August – September 2016)
Historically, two species of Aedes have been recorded from our area (Aedes aegypti and Aedes albopictus) (GBIF data).
Summer 2017, Mecklenburg County, NC: 99% Aedes albopictus (Ari Whiteman, UNC Charlotte)
https://ecdc.europa.eu/en/disease-vectors/facts/mosquito-factsheets/aedes-albopictus
Some viruses carried by Aedes albopictus and their distribution
West Nile virus: Europe, Africa, Asia, Australia, North America
Usutu virus: Africa, Europe
Dengue virus: circumtropical
Tahyna virus: Europe, Asia, Africa
Zika virus *: Americas, Southeast Asia, Africa
Chikungunya virus *: circumtropical
* Varies in geography, depending on viral strain and mosquito lineage; thus we have to remain vigilant and sample across space and time.
Summary
• These applications allow analysts to rapidly handle very large, diverse datasets on the spread of pathogens and their properties.
• The results are visual analytics that lead to actionable conclusions that can be readily communicated across disciplines.
• The applications provide means to navigate large raw datasets contributed by labs all over the world.
• As such these applications enable coordinated efforts in infection control and biosurveillance.
Acknowledgements
Organizers of Analytics Frontiers: UNC Charlotte DSI
Bank of America
Science & Technology Managers:
Chris Kiley and Ed Argenta (Defense Threat Reduction Agency Contract HDTRA1-16-C-0010).
GenomeTrakr network and data sharers around the world