“Comparative Human Microbiome Analysis”

Remote Video Talk to CICESE Big Data, Big Network Workshop

Ensenada, Mexico

October 10, 2013

Dr. Larry Smarr

Director, California Institute for Telecommunications and Information Technology

Harry E. Gruber Professor,

Dept. of Computer Science and Engineering

Jacobs School of Engineering, UCSD

http://lsmarr.calit2.net

Abstract

We are carrying out very deep metagenomic sequencing of human gut microbiomes from healthy subjects and from people with the autoimmune condition inflammatory bowel disease (IBD). We compare one subject with IBD to metagenomic datasets downloaded from the NIH Human Microbiome Project repository, including 35 healthy subjects and 20 with IBD. We also analyze changes in this one subject across multiple time points, including before and after drug therapy. The dataset of Illumina short reads for one person is ~10 GB. The total comparison dataset contains ~0.5 trillion DNA bases. This Big Data had to be moved across the network to the San Diego Supercomputer Center, where over 200,000 CPU-hours were consumed in the analysis, and then back to Calit2, where a 64-megapixel wall was used for visual analysis. This approach could be extended to cross-border comparisons of human gut microbiomes to examine differences in food intake and various disease states.

Larry Smarr is the Harry E. Gruber Professor in the Department of Computer Science and Engineering of the Jacobs School of Engineering at UC San Diego. He was the founding director of the California Institute for Telecommunications and Information Technology in 2000 and of the National Center for Supercomputing Applications in 1985.

Weizhong Li currently leads a group of researchers funded by NIH and NSF at the Center for Research in Biological Systems at UC San Diego. He has more than 20 years of experience in bioinformatics, computational biology, and computational chemistry.

Your Body Has 10 Times As Many Microbe Cells As Human Cells

Inclusion of the Microbiome Will Radically Change Medicine

99% of Your DNA Genes Are in Microbe Cells, Not Human Cells

Gut Microbiome Metagenomic Datasets Comparing Healthy and Diseased States

One “Read” = 100 DNA Bases

Total of 12.5 Billion Reads!

Source: Weizhong Li, CRBS, UCSD

We Created a Reference Database of Known Gut Genomes

• NCBI April 2013
– 2471 Complete + 5543 Draft Bacteria & Archaea Genomes
– 2399 Complete Virus Genomes
– 26 Complete Fungi Genomes
– 309 HMP Eukaryote Reference Genomes

• Total 10,741 genomes, ~30 GB of sequences

Now to Align Our 12.5 Billion Reads Against the Reference Database

Source: Weizhong Li, Sitao Wu, CRBS, UCSD
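To make the idea of aligning short reads against a reference database concrete, here is a minimal seed-and-verify sketch. It is an illustration only, not the pipeline used for this work: production analyses rely on dedicated aligners, and the toy genomes, reads, function names, and parameters below (`build_kmer_index`, `map_read`, `K`, `max_mismatches`) are all hypothetical.

```python
from collections import defaultdict


def build_kmer_index(references, k):
    """Map every k-mer to the (genome_name, position) pairs where it occurs."""
    index = defaultdict(list)
    for name, seq in references.items():
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].append((name, pos))
    return index


def map_read(read, references, index, k, max_mismatches=1):
    """Seed with the read's first k-mer, then verify the whole read
    against the reference window by counting mismatches."""
    hits = []
    for name, pos in index.get(read[:k], []):
        window = references[name][pos:pos + len(read)]
        if len(window) < len(read):
            continue  # read would run off the end of this genome
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches:
            hits.append((name, pos, mismatches))
    return hits


# Made-up toy sequences standing in for reference genomes (not real data).
references = {
    "toy_genome_A": "ACGTACGTTAGCCGATAGGCTTAACG",
    "toy_genome_B": "TTGACCGATCGGATCCAGTACGGATC",
}
K = 6  # seed length, chosen arbitrarily for the toy example
index = build_kmer_index(references, K)

# Made-up reads: an exact match, a one-mismatch match, and an unmapped read.
reads = ["CCGATAGGCT", "GGATCCAGTT", "AAAAAAAAAA"]
for read in reads:
    print(read, map_read(read, references, index, K))
```

Each read is looked up by its first k-mer only, so a mismatch inside the seed would be missed; real aligners use more robust indexing and scoring to cope with that at the scale of billions of reads.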

Computational NextGen Sequencing Pipeline: From “Big Equations” to “Big Data” Computing

PI: Weizhong Li, CRBS, UCSD; NIH R01HG005978 (2010-2013, $1.1M)

Creating a Big Data Freeway System: Coupling ‘Omics Data Generators with Supercomputers

Using Optical Fiber with 1000x Shared Internet Speeds

We Used SDSC’s Gordon Data-Intensive Supercomputer to Analyze a Wide Range of Gut Microbiomes

• ~180,000 Core-Hrs on Gordon
– KEGG function annotation: 90,000 hrs
– Mapping: 36,000 hrs
   – Used 16 Cores/Node and up to 50 nodes
– Duplicates removal: 18,000 hrs
– Assembly: 18,000 hrs
– Other: 18,000 hrs

• Gordon RAM Required
– 64GB RAM for Reference DB
– 192GB RAM for Assembly

• Gordon Disk Required
– Ultra-Fast Disk Holds Ref DB for All Nodes
– 8TB for All Subjects
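The core-hour figures above can be checked with a little arithmetic. The sketch below sums the per-stage hours, reports each stage's share, and derives a rough lower bound on wall-clock time. That bound rests on two assumptions not stated on the slide: that the 16 cores/node and 50-node allocation was available to a stage for its whole run, and that the work scaled perfectly. The variable and function names are illustrative.

```python
# Core-hours per pipeline stage, as listed on the slide.
core_hours = {
    "KEGG function annotation": 90_000,
    "Mapping": 36_000,
    "Duplicates removal": 18_000,
    "Assembly": 18_000,
    "Other": 18_000,
}

total = sum(core_hours.values())


def share_percent(stage):
    """Percentage of the total core-hours consumed by one stage."""
    return 100.0 * core_hours[stage] / total


# Allocation quoted on the slide: 16 cores per node, up to 50 nodes.
CORES_PER_NODE = 16
MAX_NODES = 50
peak_cores = CORES_PER_NODE * MAX_NODES


def min_wall_clock_hours(hours):
    """Lower bound on elapsed time, ASSUMING perfect scaling on peak_cores."""
    return hours / peak_cores


for stage, hours in core_hours.items():
    print(f"{stage:26s}{hours:>8,d} core-hrs {share_percent(stage):5.1f}%")
print(f"Total: {total:,d} core-hrs on up to {peak_cores} cores")
print(f"Mapping alone: at least {min_wall_clock_hours(core_hours['Mapping']):.0f} hours")
```

The per-stage figures add up to the quoted ~180,000 core-hours, with KEGG function annotation accounting for half of the total. Actual elapsed time would be longer than the bound, since queue waits, I/O, and imperfect scaling are ignored.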

Enabled by a Grant of Time on Gordon from SDSC Director Mike Norman

Comparing 3 LS Time Snapshots (Left) with Healthy, Crohn’s, UC (Right Top to Bottom)

Calit2 VROOM-FuturePatient Expedition

Phyla Gut Microbial Abundance Without Viruses: LS, Crohn’s, UC, and Healthy Subjects

Panels: Crohn’s, Ulcerative Colitis, Healthy, LS

Toward Noninvasive Microbial Ecology Diagnostics

Source: Weizhong Li, Sitao Wu, CRBS, UCSD

Lessons From Ecological Science: Invasive Species Dominate After Major Species Destroyed

“In many areas following these burns invasive species are able to establish themselves, crowding out native species.”

Source: Ponderosa Pine Fire Ecology, http://cpluhna.nau.edu/Biota/ponderosafire.htm

Rare Firmicutes Bloom in Colon Disappearing After Antibiotic/Immunosuppressant Therapy

Figure: Firmicutes Families for LS Time 1, LS Time 2, and Healthy Average (labeled taxon: Parvimonas spp.)

Thanks to Our Great Team!

UCSD Metagenomics Team

Weizhong Li, Sitao Wu

Calit2@UCSD Future Patient Team

Jerry Sheehan, Tom DeFanti, Kevin Patrick, Jurgen Schulze, Andrew Prudhomme, Philip Weber, Fred Raab, Joe Keefe, Ernesto Ramirez

JCVI Team

Karen Nelson, Shibu Yooseph, Manolito Torralba

SDSC Team

Michael Norman, Mahidhar Tatineni, Robert Sinkovits