Post on 02-Jun-2020
Paul J. McMurdie
Research Associate
Prof Susan Holmes Group
Statistics Department
Stanford University
Shiny-phyloseq: Web Application for Interactive Microbiome Analysis with Provenance Tracking
Overview
• Intro to Microbiome Research
• phyloseq - a microbiome BioC package
• (RNA-Seq methods solve a microbiome problem)
• Shiny-phyloseq: a shiny interface to phyloseq
What are microbes? Cell structure
(they don’t all look like this)
What are microbes? Ancestry of Life
• Bacteria
• Archaea
• Eukaryota
http://en.wikipedia.org/wiki/Tree_of_life_(biology)
What is a microbiome?
The totality of microbes in a defined environment, especially their genomes and interactions with each other and the surrounding environment.
• A population of a single species/strain is a culture: extremely rare outside of the lab, apart from some infections
• A microbiome is a mixed population of different microbial species (a microbial ecosystem)
Why study microbiomes?
• Cow Rumen
• Human Microbiomes
• Oceans, soils, waterways
• Wastewater Treatment
• Deep-Sea Hydrothermal Vent
• Earth Microbiome Project
Human Body Sites, HMP
• >10 times more microbial cells than human cells
• Entire human microbiome weighs less than 2 kg, at most
Fecal Transplants (Clostridium difficile infection)
Borody et al. (2011), Nature Rev Gastroenterology & Hepatology
Why is microbiome research new?
Bias for cultivable microbes, especially pathogens
• Culture-based methods fail to detect most microbes
• Microbes are easy to miss (except pathogens)
• Most microbes are NOT pathogens (even the human-associated)
Availability of tools limited to last 3 decades
• PCR, fast & cheap DNA sequencing, microarrays, etc.
• Discovery of culture-independent techniques - 16S rRNA
[Figure: the ribosome and its 16S rRNA; the ribosome in action]
How do we query microbiomes?
• Universal (e.g. 16S rRNA) Gene census
• Shotgun Metagenome Sequencing
• Transcriptomics (shotgun mRNA)
• Proteomics (protein fragments)
• Metabolomics (excreted chemicals)
[Figure: Number of Microbial Species Counted]
Paul J. McMurdie
Statistics Department & CEHG
Stanford University
with contributions from Prof Susan Holmes
Microbiome data heterogeneity and processing
• microbiome samples
• amplify 16S rRNA (barcoded)
• demultiplex and species clustering
phyloseq data structure & API
Component data:
• OTU Abundance: otu_table class (built from a matrix with otu_table)
• Sample Variables: sample_data class (built from a data.frame with sample_data)
• Taxonomy Table: taxonomyTable class (built from a matrix with tax_table)
• Phylogenetic Tree: phylo class from the ape package (import with read.tree, read.nexus, read_tree; set with phy_tree)
• Reference Seq.: XStringSet class from the Biostrings package (DNAStringSet, RNAStringSet, AAStringSet; set with refseq)
optional: refseq
phyloseq constructor: phyloseq
Experiment Data (phyloseq object) slots: otu_table, sam_data, tax_table, phy_tree, refseq
Accessors: get_taxa, get_samples, get_variable, nsamples, ntaxa, rank_names, sample_names, sample_sums, sample_variables, taxa_names, taxa_sums
Processors: filter_taxa, merge_phyloseq, merge_samples, merge_taxa, prune_samples, prune_taxa, subset_taxa, subset_samples, tip_glom, tax_glom
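As a minimal sketch of the constructor, accessors and processors named on this slide, the following R code builds a phyloseq object from toy tables and queries it. The counts, taxonomy and sample names are invented for illustration and are not from the talk; only functions exported by the phyloseq package are used.

```r
library(phyloseq)

# Toy data (invented for illustration): 4 OTUs counted in 3 samples.
otu_mat <- matrix(
  c(10, 0, 5,
     3, 8, 0,
     0, 2, 7,
     1, 1, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("OTU", 1:4), paste0("S", 1:3))
)

# Taxonomy table: one row per OTU, one column per rank.
tax_mat <- matrix(
  c("Bacteria", "Bacteroidetes",
    "Bacteria", "Bacteroidetes",
    "Bacteria", "Firmicutes",
    "Archaea",  "Euryarchaeota"),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("OTU", 1:4), c("Kingdom", "Phylum"))
)

# Sample variables: one row per sample, row names match the OTU table columns.
sample_df <- data.frame(
  SampleType = c("Soil", "Feces", "Soil"),
  row.names = paste0("S", 1:3)
)

# The phyloseq constructor combines the component objects.
ps <- phyloseq(
  otu_table(otu_mat, taxa_are_rows = TRUE),
  tax_table(tax_mat),
  sample_data(sample_df)
)

# Accessors
ntaxa(ps)
nsamples(ps)
rank_names(ps)
sample_variables(ps)
sample_sums(ps)
taxa_sums(ps)

# Processors: merge OTUs that share a phylum, then keep only soil samples.
ps_phylum <- tax_glom(ps, taxrank = "Phylum")
ps_soil <- subset_samples(ps, SampleType == "Soil")
```

A tree (phy_tree) or reference sequences (refseq) could be passed to the same constructor call; they are left out here to keep the toy example small.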
phyloseq work flow
• Input: OTU cluster output, sample data
• Import (gives raw phyloseq data): import_biom, import_mothur, import_pyrotagger, import_qiime, import_RDP
• Preprocessing (gives processed phyloseq data): filter_taxa, filterfun_sample, genefilter_sample, prune_taxa, prune_samples, subset_taxa, subset_samples, transform_sample_counts
• Summary / Exploratory Graphics (Direct Plots; distance, ordinate): plot_tree, plot_richness, plot_bar, plot_network, plot_heatmap, plot_ordination
• Inference, Testing: bootstrap, permutation tests, regression, discriminant analysis, multiple testing, gap statistic, clustering, procrustes
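The preprocessing stage of this work flow can be sketched as follows. This is an illustration under stated assumptions: the GlobalPatterns dataset bundled with phyloseq stands in for data that would normally come from one of the import_* functions, and the filtering thresholds are arbitrary choices, not values from the talk.

```r
library(phyloseq)

# GlobalPatterns ships with phyloseq; here it stands in for imported data
# (for example the result of import_biom() or import_qiime() on your files).
data(GlobalPatterns)

# Step 1: drop OTUs that were never observed in any sample.
gp <- prune_taxa(taxa_sums(GlobalPatterns) > 0, GlobalPatterns)

# Step 2: keep OTUs seen more than 3 times in at least 20% of samples.
# Both thresholds are arbitrary and chosen only for illustration.
gp <- filter_taxa(gp, function(x) sum(x > 3) > 0.2 * length(x), prune = TRUE)

# Step 3: convert counts to per-sample relative abundance.
gp_rel <- transform_sample_counts(gp, function(x) x / sum(x))

# The processed object is what the plotting and testing steps would consume.
gp_rel
```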
graphics
[Figure: plot_ordination, NMDS, wUF. Axes: NMDS1, NMDS2. Legend: SampleType (Feces, Freshwater, Freshwater (creek), Mock, Ocean, Sediment (estuary), Skin, Soil, Tongue)]
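A plot of this kind could be produced roughly as below. This is a sketch, not the exact call behind the slide: it assumes the GlobalPatterns example data, and it restricts the data to the 300 most abundant OTUs (an arbitrary cutoff) so that the weighted UniFrac distance is quick to compute.

```r
library(phyloseq)
library(ggplot2)

data(GlobalPatterns)
set.seed(1)  # NMDS uses random starts; fix the seed for repeatable output

# Keep the 300 most abundant OTUs (arbitrary cutoff, for speed only).
top_otus <- names(sort(taxa_sums(GlobalPatterns), decreasing = TRUE))[1:300]
gp <- prune_taxa(top_otus, GlobalPatterns)

# NMDS on weighted UniFrac distances ("wUF" in the figure caption).
ord <- ordinate(gp, method = "NMDS", distance = "wunifrac")

# One point per sample, colored by the SampleType variable.
p <- plot_ordination(gp, ord, type = "samples", color = "SampleType")
print(p)
```

The result is a ggplot object, so further layers such as geom_point(size = 3) can be added to it in the usual ggplot2 way.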
[Figure: plot_heatmap; bray-curtis, NMDS. Axes: SampleType, OTU. Color scale: Abundance (1, 100, 10000)]
[Figure: plot_network; Enterotype data, bray-curtis, max.dist=0.25. Legends: SeqTech (Illumina, Pyro454, Sanger); Enterotype (1, 2, 3)]
[Figure: plot_tree; Bacteroidetes-only. Merged samples, tip_glom=0.1. Tip labels (genus): Cytophaga, Emticicia, Sphingobacterium, Segetibacter, Haliscomenobacter, Pedobacter, Bacteroides, Alistipes, Porphyromonas, Prevotella, Parabacteroides, Algoriphagus, Odoribacter, Candidatus Aquirestis, Capnocytophaga, Spirosoma, Balneola, Hymenobacter. Legends: Abundance (1, 25, 625, 15625); SampleType (Feces, Freshwater, Freshwater (creek), Mock, Ocean, Sediment (estuary), Skin, Soil, Tongue); Order (Bacteroidales, Flavobacteriales, Sphingobacteriales)]
[Figure: plot_bar; Bacteroidetes-only. Axes: SampleType (Feces, Freshwater, Freshwater (creek), Mock, Ocean, Sediment (estuary), Skin, Soil, Tongue), Abundance (0e+00 to 6e+05). Legend: Family (Bacteroidaceae, Balneolaceae, Cryomorphaceae, Cyclobacteriaceae, Flavobacteriaceae, Flexibacteraceae, Porphyromonadaceae, Prevotellaceae, Rikenellaceae, Saprospiraceae, Sphingobacteriaceae)]
[Figure: richness plot. Panels: S.obs, S.chao1, S.ACE. Axes: Human Associated Samples (FALSE, TRUE), Number of OTUs (2000 to 8000). Legend: SampleType (Feces, Freshwater, Freshwater (creek), Mock, Ocean, Sediment (estuary), Skin, Soil, Tongue)]
plot_ordination()
plot_network()
plot_bar()
plot_heatmap()
plot_tree()
plot_richness()
phyloseq
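Two of the plot functions listed here can be sketched on the GlobalPatterns example data as follows. This is an illustration, not the code behind the slides: the "human" variable and the set of sample types treated as human-associated are assumptions introduced for this example.

```r
library(phyloseq)
library(ggplot2)

data(GlobalPatterns)
gp <- GlobalPatterns

# plot_bar: abundance per sample type, Bacteroidetes only, filled by family.
gp_bac <- subset_taxa(gp, Phylum == "Bacteroidetes")
p_bar <- plot_bar(gp_bac, x = "SampleType", fill = "Family")

# Assumption for illustration: these sample types count as human-associated.
human_types <- c("Feces", "Mock", "Skin", "Tongue")
sample_data(gp)$human <- get_variable(gp, "SampleType") %in% human_types

# plot_richness: observed and estimated OTU richness, split by the new flag.
p_rich <- plot_richness(
  gp, x = "human", color = "SampleType",
  measures = c("Observed", "Chao1", "ACE")
)

print(p_bar)
print(p_rich)
```

Richness estimates are computed on the unfiltered counts here, since estimators such as Chao1 depend on rare OTUs.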
Side Note: BioC tools for microbiome
edgeR, DESeq(2), metagenomeSeq perform better than popular alternatives in differential abundance detection:
McMurdie and Holmes (2014) PLoS Comp Biol
DOI: 10.1371/journal.pcbi.1003531
http://joey711.github.io/waste-not-supplemental/
[Figure: two count tables side by side: gene counts (genes by samples) and species counts (species by samples)]
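Because a species count table has the same shape as a gene count table, an RNA-Seq tool can be applied to it directly. The sketch below runs DESeq2 on the GlobalPatterns example data through phyloseq's phyloseq_to_deseq2 converter. It is an illustration under stated assumptions, not the analysis from the cited paper: the "human" grouping, the 500-OTU cutoff and the "poscounts" size-factor choice (used because microbiome tables contain many zeros) are all choices made for this example.

```r
library(phyloseq)
library(DESeq2)

data(GlobalPatterns)
gp <- GlobalPatterns

# Assumption for illustration: compare human-associated vs. other samples.
human_types <- c("Feces", "Mock", "Skin", "Tongue")
sample_data(gp)$human <- factor(get_variable(gp, "SampleType") %in% human_types)

# Keep the 500 most abundant OTUs (arbitrary cutoff, for speed only).
top_otus <- names(sort(taxa_sums(gp), decreasing = TRUE))[1:500]
gp_top <- prune_taxa(top_otus, gp)

# Convert to a DESeqDataSet; species counts play the role of gene counts.
dds <- phyloseq_to_deseq2(gp_top, ~ human)

# "poscounts" size factors tolerate OTUs with zeros in some samples.
dds <- DESeq(dds, sfType = "poscounts", quiet = TRUE)

# Results table: one row per OTU, sorted by adjusted p-value.
res <- results(dds)
res <- res[order(res$padj), ]
head(as.data.frame(res))
```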
Acknowledgements
• Susan Holmes: Postdoc Advisor, Mentor, Co-author
• Wolfgang Huber: Helpful advice and feedback re: DESeq(2)
• BioC and CRAN: Support, Feedback, Distribution of phyloseq and biom
• RStudio: Shiny, RStudio IDE
• Hadley Wickham: ggplot2, reshape2, plyr R packages
• Holmes Group: Helpful advice and feedback
Shiny-phyloseq
Live Demo
How to Run:
install.packages("shiny")
shiny::runGitHub("shiny-phyloseq", "joey711")
http://joey711.github.io/shiny-phyloseq/
End. Questions?