An enhanced characterization of the human skin microbiome: a new biodiversity of microbial interactions

Akintunde Emiola¹, Wei Zhou¹, Julia Oh¹*

¹ The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, USA
*Corresponding author. [email protected]

ABSTRACT

The healthy human skin microbiome is shaped by skin site physiology and individual-specific factors, and is largely stable over time despite significant environmental perturbation. Studies identifying these characteristics used shotgun metagenomic sequencing for high-resolution reconstruction of the bacteria, fungi, and viruses in the community. However, these conclusions were drawn from a relatively small proportion of the total sequence reads analyzable by mapping to known reference genomes. ‘Reference-free’ approaches, based on de novo assembly of reads into genome fragments, are also limited in their ability to capture low-abundance species and small genomes, and to discriminate between more similar genomes. To account for the large fraction of non-human unmapped reads on the skin, referred to as microbial ‘dark matter’, we used a hybrid de novo and reference-based approach to annotate a metagenomic dataset of 698 healthy human skin samples. This approach reduced the overall proportion of uncharacterized reads from 42% to 17%. With our refined characterization, we revisited assumptions about the skin microbiome, and demonstrated higher biodiversity and lower stability, particularly in dry and moist skin sites. To investigate hypotheses underlying stability, we examined growth dynamics and interspecies interactions in these communities. Surprisingly, even though most skin sites were relatively stable, many dominant skin microbes, including Cutibacterium acnes and staphylococci, were actively growing in the skin, with poor or no relationship between growth rate and relative abundance, suggesting that host selection or interspecies competition may be important factors maintaining community homeostasis. To investigate other mechanisms facilitating adaptation to a specific skin site, we identified Staphylococcus epidermidis genes that are likely involved in stress response and provide mechanisms essential for growth in oily sites. Finally, horizontal gene transfer, another mechanism of competition by which strains may swap antagonistic or virulent coding regions, was relatively limited in healthy skin, but suggested exchange of different metabolic and environmental tolerance pathways. Altogether, our findings underscore the value of a combined reference-based and de novo approach to provide significant new insights into microbial composition, physiology, and interspecies interactions to maintain community homeostasis in the healthy human skin microbiome.

bioRxiv preprint doi: https://doi.org/10.1101/2020.01.21.914820; this version posted January 23, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

BACKGROUND

Deep metagenomic shotgun sequencing is a powerful tool to interrogate composition and function of complex microbial communities. Microbial communities offer the potential for discovery of a tremendous suite of previously unknown biological functions, for example, new bioactive compounds, antimicrobials, virulence factors, or metabolic pathways. Such discovery has relied on the ability to survey and deconvolute species from mixed microbial consortia. Advances in next-generation sequencing and computational analyses have, in recent years, greatly furthered efforts to reconstruct microbial communities at the species [1,2], strain [2,3], and even single nucleotide polymorphism level [3,4], examining function, transmission, and stability of the resident microbes.

However, interpretations of many metagenomic datasets are limited by the inability to characterize a large fraction of the total microbial reads present in the original sample [5,6]. This uncharacterized sequence space, or microbial ‘dark matter’ [7], typically results from the inability to map a sequence read to a known microbial reference genome and can exceed 96% of sequence reads within a sample [5]. Such ‘reference-based’ approaches, whether mapping reads to complete genomes [8] or marker genes [9], have high sensitivity and discriminatory ability between even very similar genomes [8]. However, microbes with no representative reference, or those with significant pangenomic variation, which can account for considerable within-species diversity in gene content [10], are not captured. Conversely, reference-free approaches, based on de novo assembly to aggregate reads into longer stretches of contiguous DNA sequence, can aid in the identification and characterization of new genomes. However, de novo assembly-based approaches are less effective at capturing small genomes (e.g., viruses) and low-abundance microbes, and at discriminating between very similar genomes.

By combining both approaches into a holistic framework, we aimed to reduce the proportion of uncharacterized sequence space in a metagenomics dataset, and thus provide new insights into the biological function and interspecies interactions of these microbial communities. We used a hybrid de novo and reference-based approach aimed at characterizing microbial dark matter in the skin metagenome. Our previous analyses of this dataset (698 samples), which were exclusively reference-based, showed that the skin microbiome is defined primarily by the physiological characteristics of the skin site (e.g., whether it was a sebaceous, moist, dry, or foot site), then by host-intrinsic factors that confer individuality in strain representation and the presence of low-abundance and transient organisms [5,6]. More intriguing was the observation that the skin microbiome is remarkably stable even over years, despite the exposure of skin to different hygiene practices and the external environment [6]. However, our conclusions were based on an incomplete portrait with, on average, half of each sample remaining uncharacterized by our reference-based analyses [5]. By incorporating additional information from microbial dark matter, we stood to gain significant new insights into the landscape of skin biodiversity and microbial stability.

Leveraging our integrated approach, we uncovered previously unaccounted-for biodiversity and reduced microbial stability in the skin microbiome. We used this refined characterization to more deeply probe interspecies interactions, identifying intra-genus diversity and mechanisms underlying stability and inter-species interactions in the skin, including new assessments on growth rate and horizontal gene transfer. Our results demonstrate the highest resolution analysis of the skin microbiome to date, and provide new hypotheses for how skin microbes interact and compete to maintain homeostatic community conditions.


RESULTS

A hybrid de novo and reference-based microbial community analysis

To address the significant uncharacterized sequence space (mean ± sd 42% ± 24%) in our initial analysis of a 698-sample longitudinal skin metagenomic dataset (Supplementary Fig. 1), we used reference-independent approaches to reconstruct composition. With the improvement of de novo assembly algorithms to input large datasets [11], we concatenated our samples and assembled iteratively, resulting in 75% ± 19% of reads being incorporated into the assemblies (Supplementary Fig. 2). The 1,037,465 resultant contigs >1 kb were then grouped into genome ‘bins’ based on co-abundance clustering and nucleotide composition. However, because this approach is limited in its ability to recover small genomes and low-abundance species, and to ascertain precise taxonomic classifications, we speculated that integrating reference-based analyses (Fig. 1A) would further reduce dark matter beyond the 33% ± 21% reduction observed by mapping reads to our de novo reference set only (Fig. 1B). Microbial reads unmappable to our de novo reference catalogue were annotated by mapping to a reference database of fungal, bacterial, and viral genomes. Reference-based and de novo classifications were integrated with a normalization step that took into consideration the total proportion of reads derived from each approach (Supplementary Fig. 2). While using de novo references significantly aided reconstruction of microbiota, our hybrid approach most considerably reduced the proportion of dark matter (16% ± 17%; Fig. 1B and C).

A new biodiversity of the human skin metagenome

Our new compositional analysis was largely concordant with our previous findings that the skin microbiome is predominated by Staphylococcus, Cutibacterium (formerly Propionibacterium), Corynebacterium, and Malassezia species [5,6] (Fig. 1C, Fig. 1D).
However, we uncovered considerably more diversity of Staphylococcus, Corynebacterium, Proteobacteria, and fungal genomes at most skin sites (Fig. 1C, Supplementary Fig. 3A). For example, we identified a previously uncharacterized, but abundant, Lactobacillales colonizing the external auditory canal (Ea) as Alloiococcus otitis. Of the reads mapped to bins, 9% ± 13% were unclassifiable based on BLASTn alignment to the NCBI nt database. These likely represented contigs that belong either to uncharacterized genomes or to undiscovered pan-genomic variation of a lower abundance strain that could not be binned with its species unit. These otherwise unclassifiable bins were most frequently bacterial, underscoring that the majority of undiscovered biodiversity in the skin is not fungal or viral (Supplementary Fig. 3B).

Our revised classification showed lower representation of C. acnes than previous analyses, and markedly lower Propionibacterium phage representation in sebaceous regions. The de novo-only approach did not capture viral contigs; these were most accurately recovered with the combined use of reference genomes, for example in the alar crease (Al) (Fig. 1C).

With our integrated classification, we re-examined our conclusions of diversity and stability at the community level. All skin sites except the ear and foot had higher diversity than previously reported, which is expected given the identification and inclusion of more genomes/genome bins than in our original analyses (Fig. 1E). Because resolved dark matter only represented a few genomes in the external auditory canal (Ea) (mostly Lactobacillales) and foot sites (mostly Staphylococcus and Corynebacterium), diversity was unchanged in these regions. Since previous analyses correlated increased diversity with decreased stability [6], we re-evaluated our conclusions on stability over the ~month (T2-T3) and ~year intervals (T1-T2) collected in this study. Most sites were less stable than originally defined (Fig. 1F); for example, the hand and forearm (dry) sites, which are highly exposed to the external environment (Fig. 1F, Supplementary Fig. 4A). This may be due to behavioral patterns like hand-washing or increased acquisition of transient, environmental microbes. Longitudinal tracking of individual species’ dynamics over time showed that community stability is driven by specific microbes (Supplementary Fig. 4B). For example, newly identified Corynebacterium species (e.g., Corynebacterium jeikeium) were associated with stability whereas staphylococci showed more fluctuation over time.

Mechanisms underlying interspecies interactions: growth dynamics of skin microbiota

A significant limitation of previous skin microbiome studies is the absence of information on microbial activity, which is necessary to understand potential underlying homeostatic mechanisms as microbes are unlikely to be completely inactive in the skin. We reasoned that the contributions of viable populations to overall microbial abundance and functional community potential could be inferred by assessing bacterial growth rate from the skin metagenome. This would also allow us to estimate the ratio of rapidly growing to dead/stationary cells of common skin microbes, which would provide additional insights into antagonistic interspecies interactions.

To achieve this, we used the peak-to-trough ratio (PTR) method [12], implemented in GRiD [13], which maps metagenomic reads to a bacterial reference genome to calculate coverage drops across the genome.
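
The coverage-based idea behind a peak-to-trough ratio can be illustrated with a short Python sketch. This is a simplified illustration and not GRiD's implementation: it assumes that per-bin read depth along a circular chromosome has already been computed from mapped reads, smooths that depth with a circular moving average, and reports the ratio of the highest to the lowest smoothed value. The function name `peak_to_trough_ratio`, the `window` parameter, and the synthetic coverage profiles are hypothetical and chosen only for this example.

```python
import numpy as np


def peak_to_trough_ratio(coverage, window=5):
    """Simplified PTR: highest / lowest smoothed depth on a circular genome.

    coverage: 1-D sequence of mean read depth per genome bin (assumed input).
    window:   odd number of bins used for the circular moving average.
    """
    cov = np.asarray(coverage, dtype=float)
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    if cov.ndim != 1 or cov.size < window:
        raise ValueError("coverage must be 1-D and at least `window` bins long")

    # Wrap the ends so the moving average respects chromosome circularity.
    half = window // 2
    padded = np.pad(cov, half, mode="wrap")
    smoothed = np.convolve(padded, np.ones(window) / window, mode="valid")

    trough = smoothed.min()
    if trough <= 0:
        raise ValueError("coverage trough must be positive")
    return float(smoothed.max() / trough)


# Synthetic profiles (hypothetical data, not from the study).
n_bins = 200
positions = np.arange(n_bins)
# Depth that is twice as high at one point of the circle as at the opposite point.
growing = 100 * (1.5 + 0.5 * np.cos(2 * np.pi * positions / n_bins))
# Flat depth, as expected when no replication forks are active.
resting = np.full(n_bins, 30.0)

ptr_growing = peak_to_trough_ratio(growing)
ptr_resting = peak_to_trough_ratio(resting)
print(f"growing: {ptr_growing:.2f}, resting: {ptr_resting:.2f}")
```

In this toy setting a flat profile gives a ratio of about 1, while a profile with a pronounced coverage peak gives a ratio close to 2. Real data would additionally need filtering of repetitive or poorly mapped regions, which this sketch omits.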
Because most bacteria harbor a single circular chromosome replicated bi-directionally from the origin of replication (ori) to the terminus (ter) region [14], rapidly growing cells will have a higher coverage at ori vs ter.

First, we generalized our analysis to define the steady-state growth dynamics of dominant skin microbes. Defining a microbe with a GRiD score > 1 as being in exponential phase, we identified the proportion of bacteria that were actively growing across different skin sites. Most examined microbes were most active in dry sites (i.e., palm and forearm) (Fig. 2A), which is interesting because these regions are often perturbed (e.g., by hand washing) and biomass is low. Such factors could affect microbial viability and growth rate to replenish the endogenous community. In contrast, sebaceous sites, which typically harbor higher biomass, least supported rapid growth, which could reflect the relatively specialized physiologic growth conditions, including anoxia, which would limit or slow growth of many microbes [15]. Foot sites favored growth of S. epidermidis and other non-lipophilic microbes at the expense of sebum-metabolizing C. acnes, which was least active at that site (Fig. 2A). Strikingly, growth rates of these microbes were stable over multiple timepoints (Fig. 2B).

We also asked if increased growth rate resulted in increased relative abundance. Relative abundance was positively correlated with growth rate only for certain species (Fig. 2C). For example, C. ureicelerivorans’ relative abundance was strongly correlated with growth rate in all skin sites (Fig. 2C). This suggests that for select species, microbial abundance may be controlled by how rapidly cells are dividing. In contrast, C. acnes relative abundance and growth rate were strongly anti-correlated in sebaceous sites (Fig. 2C), suggesting that C. acnes’ growth rate may be modulated based on the presence/absence of competition or is regulated by its quorum sensing mechanisms.

Finally, we examined the microbial growth dynamics in patients with primary immunodeficiency syndrome, characterized by eczematous lesions [5], as rapid growth might identify potential pathogens. Common skin commensals such as S. epidermidis and C. ureicelerivorans had significantly decreased growth rate in these patients (Fig. 2D). The decreased growth rate of S. epidermidis is surprising because these patients are unusually prone to staphylococcal infections. This observation may be due to the lack of correlation between growth rate and abundance for S. epidermidis in most skin sites (Fig. 2C).

Structural variants of Staphylococcus epidermidis

Community homeostasis is maintained by host factors, interspecies interactions, and acquisition of genes, the latter of which can play an important role in modulating adaptability to a particular environmental niche. These genes can encode proteins that influence signaling, virulence, and antimicrobial properties [16,17]. Consequently, we examined microbial structural variants that may potentially harbor genes that play a role in strain adaptability to a specific skin site. We focused our analysis on S. epidermidis because, unlike other staphylococci, it thrives in multiple skin sites (Fig. 2A).

We retrieved pangenome sequences from panDB, a database which assembles non-redundant genomic regions from multiple sequenced strains [18], split sequences into 1 kb fragments, and determined the enrichment of fragments across samples. Because staphylococci other than S. epidermidis thrive mostly in foot sites, and less in sebaceous regions (Fig. 2A), we investigated structural variants that might be associated with S. epidermidis adaptability in sebum-rich sites. We identified 6 S. epidermidis fragments that were always enriched in sebaceous sites when compared with other sites (Fig. 3A), which suggests that these may harbor genes essential for survival in sebaceous regions.

In addition, we hypothesized that if these fragments are indeed associated with adaptability, homologues in S. capitis, a closely related genome, will also be associated with adaptability in the latter. Similar to S. epidermidis, we identified 11 S. capitis fragments that were always differentially enriched in sebaceous sites (Fig. 3A). Interestingly, 4 of the 7 genes harbored in S. epidermidis fragments had homologues in S. capitis (Fig. 3B), suggesting an underlying importance in sebaceous sites.

Next, we examined if these candidate genes could be located in mobile genetic elements, which may suggest inter/intra-species transmission. Using BLASTn alignment to the NCBI nt database, all candidate genes could be identified in previously annotated plasmids or phages (Fig. 3C). We conducted additional benchmark analysis to minimize false positives by determining the correlation between candidate genes and S. epidermidis abundance or growth rate. Unsurprisingly, all fragments containing genes with homologues in S. capitis were positively correlated with both abundance and growth rate in sebaceous sites. Most of these genes encode hypothetical proteins; however, a candidate (S_epi_13619) encodes a stress response protein, suggesting that mechanisms to adapt to otherwise unfavorable conditions may play an important role in survival.

Indirect interactions: horizontally-transferred genes in skin microbes

To further investigate the genetic basis of interspecies interactions using our refined metagenomic analysis, we investigated horizontal gene transfer (HGT). HGT is a mechanism by which a microbe can acquire genetic material that may confer an increased survival or competitive advantage within a community.

We developed an HGT prediction pipeline (Supplementary Fig. 5) and, based on a simulated dataset, the pipeline was able to identify 31%-51% of the simulated HGT genes. We found that the prediction sensitivity was confined by the sensitivity of the metagenomic assembler (i.e., whether a gene was fully assembled) and the sensitivity of the gene predictor (i.e., whether an open reading frame was correctly annotated as a gene) (Supplementary Fig. 6), but not the synonymous distance-based algorithm presented in this study. The predicted HGT genes exhibited a large variation of synonymous distances, representing both recent HGT events and more ancient HGT events during the diversification of the microbial species (Supplementary Fig. 6). Importantly, the HGT genes with the lowest synonymous distances (synonymous distance < 0.1), which correspond to the most recent HGT events, almost exclusively matched the simulated HGT genes (94%-100%, Supplementary Fig. 6), demonstrating the high specificity of the prediction pipeline.

To predict HGT among microbes within the skin microbiome, we developed a pipeline using our existing metagenomic data. In each pair of assembled genomes, we identified HGT candidates by looking for pairs of genes that are significantly more similar than immobile genes (Fig. 4A, Supplementary Fig. 5).
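
The notion of gene pairs that are "significantly more similar than immobile genes" can be illustrated with a small Python sketch. It is not the authors' parametric model, whose details are not given in this section. It assumes a synonymous distance is already available for each homologous gene pair between two genomes, treats the bulk of pairs as the vertically inherited background, and flags pairs whose distance falls far below that background using a robust z-score (median and median absolute deviation). The function name `flag_hgt_candidates`, the `z_cutoff` value, and the toy distances are all hypothetical.

```python
from statistics import median


def flag_hgt_candidates(distances, z_cutoff=-3.0):
    """Flag gene pairs that are unusually similar for a given genome pair.

    distances: dict mapping a gene-pair identifier to its synonymous distance
               (assumed to be pre-computed for one pair of genomes).
    z_cutoff:  robust z-score at or below which a pair is flagged
               (illustrative threshold, not taken from the paper).
    """
    values = list(distances.values())
    if len(values) < 3:
        raise ValueError("need at least three gene pairs to estimate a background")

    # Background of presumably immobile genes: median and scaled MAD.
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return []  # no spread in the background, so nothing stands out
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation

    return sorted(
        pair for pair, d in distances.items() if (d - center) / scale <= z_cutoff
    )


# Toy distances for one genome pair (hypothetical values).
example = {
    "g1": 0.80, "g2": 0.85, "g3": 0.78, "g4": 0.82,
    "g5": 0.90, "g6": 0.75, "g7": 0.02,
}
candidates = flag_hgt_candidates(example)
print(candidates)
```

Here only the nearly identical pair stands out against a background of diverged genes. A median-based background is used because a handful of transferred genes would otherwise distort a mean and standard deviation.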
Our HGT identification pipeline is a parametric version of a previously described robust method that searched for identical or near-identical gene pairs in distantly related microbial genomes [19]. Consistent with previous reports, functional annotation of HGT candidates showed a distribution over a wide functional spectrum [19] (Fig. 4B). The most overrepresented functional category among the predicted HGT candidates was transporters, highlighting the potential of the microbes to acquire, through HGT events, the ability to take up environmental nutrients and extrude harmful molecules (Fig. 4B). Although most types of transporters were uniformly distributed in the mobile gene pool, microbiomes at the sebaceous sites demonstrated enrichment of transporters involved in transporting metallic cations (Fig. 4B and 4C), including iron, manganese, zinc, cobalt, nickel, and biotin. These results suggest the existence of a pool of (conditionally) mobile genes exerting a multitude of biological functions, with enrichment of specific functions observed at specific skin types.

Finally, we constructed a network to reflect the top HGT events among microbial species in each type of skin site (Fig. 4D). Across skin sites, HGT events were identified as a function of abundant species. For example, HGT candidates identified at a sebaceous site predominantly came from Actinobacteria, including bacteria in the genera Propionibacterium, Corynebacterium and Staphylococcus (Fig. 4D). Due to its dominance in many skin sites, C. acnes was central in networks corresponding to all skin types except for toenails, in which Actinobacteria (C. acnes, Corynebacterium singulare, Micrococcus luteus, and Kocuria rhizophila) and multiple Staphylococcus species formed disconnected networks, strongly suggesting that HGT events at a body site were driven by the microbiome composition at the site. Overall, this pipeline characterizes statistically likely HGT candidates directly from shotgun metagenomic data, and is useful to simultaneously estimate the functional and taxonomic distribution of the mobile genes between genera.

DISCUSSION

The human skin, our largest organ and first line of defense, is home to a diversity of microorganisms. These microbes play an essential role in influencing metabolic processes, immune system modulation, and antagonism of potentially transient pathogens [15]. Alterations in community composition have been associated with a number of skin diseases like atopic dermatitis, psoriasis, and eczema [20,21]. Numerous host-intrinsic (e.g., genetics, immune competence, skin barrier conditions), environmental or lifestyle factors (e.g., hygiene, exposure to different microbes), as well as microbiome-intrinsic factors affect disease predilection. A deep understanding of these factors, as well as the ecological complexity of the skin’s microbiota, is needed to understand factors that influence its homeostasis and predisposition to disease.

Large-scale studies have aimed to characterize the skin microbiota using deep shotgun metagenomic sequencing, yet conclusions drawn from those previous analyses were limited in that a majority of sequence reads could not be mapped to any known genome [5,6]. To address this limitation, we used an integrated approach that incorporated de novo assembly and binning with reference-based analyses to identify and quantify new microbial skin occupants. This dual approach provided a key improvement in the resolution of the community.
While we observed a significant reduction in uncharacterized sequence space using de novo approaches, viral genomes and low-abundance genomes were poorly captured but could be resolved with reference genomes.

High-level analyses of the defining characteristics of the skin microbiome were largely concordant with previous findings. However, our hybrid approach provided important new insights into skin community structure, including increased diversity, reduced stability across many sites, dominance of previously uncharacterized microbes in certain skin sites like the external auditory canal, and lower representation of phage than previously believed. With few exceptions, we found that previous reports overestimated stability of the skin microbiome, likely because they lacked deeper classification of additional genomes, particularly for staphylococci and coryneforms. Subsequently, we focused our analysis, continuing to interleave reference-based and de novo approaches, on potential factors that could underlie community stability and homeostasis, including the growth rate of different community members and interspecies interactions.

Leveraging metagenomic data to predict growth rate of dominant skin species, we measured marked variance in which species were actively growing in the skin, and how skin site could affect activity. For example, most microbes in dry sites were actively dividing compared to relatively few in sebaceous sites. Moreover, S. epidermidis appeared to grow exponentially in all sites, at similar growth rates, suggesting that the skin environment generally provides adequate nutrients to support its rapid growth. Alternatively, S. epidermidis strains may have acquired specific genes modulating adaptability to each site. For example, we identified numerous genes present in mobile genetic elements that were associated with adaptability to sebaceous sites. Yet, strikingly, its ultimate relative abundance was not correlated with growth rate in most skin sites, suggesting an equally rapid killing or cell death. In this case, the lack of correlation between growth rate and relative abundance indicated that competitiveness within the community is largely independent of growth rate. Other skin sites/species showed a different relationship between growth rate and relative abundance. A positive correlation suggests that strains out-compete the rest of the community during exponential growth, as in the case of C. ureicelerivorans. Conversely, a negative correlation may reflect an active quorum sensing mechanism involved in the regulation of growth rate. For example, we observed a negative correlation for C. acnes that is restricted to sebaceous skin, suggesting that the microenvironment may play a role in regulating growth rate.

In addition, we observed that several types of transporters, especially the metal transporters, were highly abundant in the HGT gene pools at the sebaceous sites, potentially reflecting the importance of transporting a vast variety of lipid-soluble metals diffusing through the permeable cells of the sebaceous glands and follicular walls, a major absorption pathway of metals, including metallic toxicants.
Additionally, the enrichment of metal transporters at the sebaceous sites paralleled the over-representation of Actinobacteria species at those skin sites, raising the possibility that the dissemination of metal transporting ability among Actinobacteria microbes at the sebaceous sites may be important to metal balance and consequently the health of the host.

In conclusion, we present a new landscape of the skin microbiome, providing the highest resolution reconstruction of microbial community composition and biodiversity in the skin to date. Importantly, we have used new approaches and analyses in a multifaceted investigation of functional elements and interspecies interactions underlying stability and community homeostasis. Our findings have generated testable hypotheses to interrogate interspecies and/or interstrain inhibition. Finally, our analyses are broadly applicable to investigate these mechanisms in skin disease, which ultimately can provide clues as to sources of antimicrobials directed towards strain-specific pathogens.

METHODS

Sample datasets

We retrieved 698 metagenomic shotgun skin samples from our previous work [5,6], which have been quality filtered for the presence of human DNA. The majority of the samples (n = 594) in this dataset were derived from longitudinal sampling of 12 individuals at 3 different time points with sampling intervals of 10-30 months (“long”) and 5-10 weeks (“short”). Twenty-four samples from this collection were also derived from 2 individuals with hyper IgE syndrome. The remaining samples represent a single timepoint from three additional healthy individuals.

Taxonomic classification of skin microbes

To classify skin microbes, we used a hybrid de novo-based and reference-based characterization technique (Supplementary Fig. 2). All samples were concatenated and sequence reads were assembled into contigs using MEGAHIT 1.0.5 (--k-min 37 --k-max 67 --k-step 10 -m 0.99 --kmin-1pass --continue) [11], which was used for its ability to handle large datasets with relatively low memory requirement and short run time. We discarded contigs < 1 Kb, mapped each individual sample’s reads back to the contig catalog using bowtie2 2.2.8 (--sensitive) [22], and extracted unmapped reads. However, read coverage from our derived contig catalog was relatively low (Supplementary Fig. 2). To re-assemble unmapped reads, we concatenated a subset of unmapped reads from randomly selected samples (due to the high memory requirement of SPAdes [23]) and re-performed previous steps using SPAdes 3.7.1 (--meta) for de novo assembly, which was better able to resolve scaffolds from contigs. Newly extracted contigs/scaffolds were merged with the previous catalog to produce a non-redundant contig catalog. We repeated the mapping of reads to the catalog to retrieve unmapped reads, randomly concatenated a small subset of unmapped reads, and performed assembly using SPAdes. We repeated this iterative step until no additional improvement in read coverage from our contigs/scaffolds catalog was observed (Supplementary Fig. 2), obtaining a total of 1,037,465 contigs/scaffolds > 1 Kb.

Contigs/scaffolds were then grouped into genome bins using MetaBAT (--sensitive, -m 1500) [24], which resulted in 556 bins. Bins were taxonomically classified using MEGAN 4.70.4 [25] (Supplementary Table 1). We excluded 22 bins which were of non-microbial origin and further evaluated the quality of each bin using CheckM 1.0.6 [26]. For stringent annotation, we required that ≥ 65% of contigs/scaffolds present in a bin were assigned to the lowest level taxonomy; the sole exception was the kingdom-level taxonomy, where our requirement was 40%. Bins were labeled “uncharacterized” if they could not be assigned to at least a kingdom, although further relaxing our requirements to exclude contigs/scaffolds with no hits enabled characterization of those bins to at least the kingdom level (Supplementary Fig. 3B).

Each sample was then mapped back to the genome bins using bowtie2. Unmapped reads were subsequently characterized by a reference-based approach using reprDB [18] and assigned to a genome using PathoScope 2.0 [8]. Reads unassignable by either method were categorized as “dark matter”.

Bacterial growth rate estimation from skin samples

We estimated bacterial growth rate using GRiD (v1.3) [13] with a coverage cutoff of 0.2. We created a custom GRiD database using the species C. acnes, S. epidermidis, S. aureus, C. jeikeium, and C. ureicelerivorans.

Structural variant analysis

We retrieved pangenome sequences for S. epidermidis and S. capitis from panDB [18], split genomes into 1 Kb fragments, and predicted genes using prokka [27]. We mapped reads using bowtie2 to the fragment pool of each species, estimated read counts across samples using featureCounts [28], and used DESeq2 [29] to infer genes differentially enriched between skin sites. We filtered candidates using an adjusted p-value of 0.05 and log2 fold change of 1. We aligned genes from S. epidermidis and S. capitis to generate a phylogenetic tree using MAFFT [30].
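As a rough illustration of the fragment-and-filter steps in the structural variant analysis, the sketch below splits a sequence into 1 Kb windows and applies the two reported thresholds to a table of differential-enrichment results. It is not the authors' code. The function names, the dictionary keys (`padj`, `log2fc`), the decision to keep a short final fragment, and the reading of the thresholds as "adjusted p-value below 0.05 and absolute log2 fold change of at least 1" are all assumptions made for illustration.

```python
from typing import Iterator, List, Tuple

FRAGMENT_SIZE = 1000  # 1 Kb, as stated in the text


def split_into_fragments(seq: str, size: int = FRAGMENT_SIZE) -> Iterator[Tuple[int, str]]:
    """Yield (start, fragment) pairs covering seq in consecutive windows.

    The last fragment may be shorter than `size`; keeping it is an assumption,
    since the text does not say how short tails were handled.
    """
    for start in range(0, len(seq), size):
        yield start, seq[start:start + size]


def filter_candidates(results: List[dict],
                      padj_cutoff: float = 0.05,
                      lfc_cutoff: float = 1.0) -> List[dict]:
    """Keep rows with padj < padj_cutoff and |log2fc| >= lfc_cutoff.

    Rows with a missing value are dropped. The key names and the use of an
    absolute fold change are hypothetical, not taken from the manuscript.
    """
    kept = []
    for row in results:
        padj = row.get("padj")
        lfc = row.get("log2fc")
        if padj is None or lfc is None:
            continue
        if padj < padj_cutoff and abs(lfc) >= lfc_cutoff:
            kept.append(row)
    return kept
```

For example, a 2,500 bp sequence would give fragments starting at positions 0, 1000, and 2000, the last one 500 bp long.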


Horizontal gene transfer prediction

By definition, HGT genes that transferred between two lineages have shorter evolutionary distance than immobile genes (those that diverged at the time of divergence of the lineages). Therefore, HGT genes tend to be more similar [19]. For a pair of genomes, one reliable method of identifying HGT genes is to search for pairs of genes that are more similar to each other than the immobile genes in the genomes that reflect the evolutionary distance of the genomes. Genomes assembled from metagenomic shotgun data are inevitably incomplete and inaccurate, especially at strain level. Therefore, we assembled genomes at species level and proceeded only with genomes that have at least 25% completeness. To do this, metagenomic shotgun reads were first assembled using MEGAHIT. The resulting contigs were then assigned a taxonomic label using Kraken v0.10.6 [31]; all contigs assigned to the same species were combined to represent the species draft genome. Genes were predicted from each species draft genome using prodigal [32], and completeness of the assemblies was assessed using BUSCO v2 [33].

We identified potential HGT genes in each pair of species genomes. To do this, we first assembled a set of immobile genes from the genome pair to compute a null distribution of sequence similarity. Immobile genes were identified using the bacteria-specific universal single-copy orthologs (USCOs) annotated by BUSCO. Because USCOs are universally present across bacterial lineages and exist only in single copy, their horizontal transfer is unlikely. Second, all gene sequences from a pair of species genomes were clustered using uclust [34] at 0.5 similarity cut-off to reduce complexity. If any pair of genes within a cluster and from different species genomes has a significantly higher similarity than the immobile genes, the gene pair is identified as horizontally transferred. To mitigate the effect of natural selection, we computed synonymous distance (the number of synonymous changes per synonymous site) as the test statistic for similarity. Each pair of protein-coding genes was first reverse-aligned using the seqinr package [35], after which synonymous distance was computed using PAML [36], which implements an ad hoc method that corrects for codon frequency bias. Finally, to further lessen the influence of purifying selection, we removed HGT candidates that represented essential genes (i.e., genes with positive hits to the DEG 10 database [37] using ublast [34] with e-value < 10^-9) and ribosome genes (i.e., genes corresponding to KEGG BRITE ko03009 and ko03011).

We applied the prediction pipeline to three simulated microbial communities for validation. The simulated communities were generated using HgtSIM [38]. Briefly, each community included three common skin bacteria species: C. acnes, S. epidermidis, and Streptococcus mitis. Each species was represented by three sequenced strain genomes, including one RefSeq representative strain and two other strains from RefSeq. All 9 strains were mixed at equal abundances in each simulated community. From the representative strain, 5 genes were randomly selected and horizontally transferred to all other strains in the community (a total of 15 HGT genes in each community). The three microbial communities differed in the number of mutations accumulated in the HGT genes: the HGT genes were allowed to accumulate 0, 5% and 10% mutated bases in the recipient strains for the three communities, respectively. Three million paired-end metagenomic shotgun reads were sampled from each community as part of the HgtSIM pipeline, and subsequently processed using the HGT prediction pipeline described above.
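The core comparison in this pipeline, testing whether a gene pair has a smaller synonymous distance than the immobile USCO pairs, can be sketched as follows. The text does not specify which statistical test was used, so the one-sided empirical p-value with a pseudocount, the significance level of 0.05, and all function and variable names below are assumptions for illustration only, not the authors' implementation.

```python
from bisect import bisect_right
from typing import Dict, List, Sequence


def empirical_p_value(candidate_ds: float, sorted_null_ds: Sequence[float]) -> float:
    """One-sided empirical p-value for a candidate synonymous distance.

    Returns the fraction of null (immobile gene pair) distances that are
    less than or equal to the candidate, with a +1 pseudocount so the
    p-value is never exactly zero. `sorted_null_ds` must be sorted ascending.
    """
    if not sorted_null_ds:
        raise ValueError("null distribution is empty")
    n_le = bisect_right(sorted_null_ds, candidate_ds)
    return (n_le + 1) / (len(sorted_null_ds) + 1)


def call_hgt_candidates(pair_ds: Dict[str, float],
                        null_ds: Sequence[float],
                        alpha: float = 0.05) -> List[str]:
    """Return gene-pair identifiers whose distance is unusually small
    relative to the null distribution (alpha is a hypothetical cutoff)."""
    ordered = sorted(null_ds)
    return [pair for pair, ds in pair_ds.items()
            if empirical_p_value(ds, ordered) < alpha]
```

With this formulation, the smallest attainable p-value is 1 / (n + 1) for n immobile gene pairs, so a genome pair with very few shared USCOs cannot yield a significant call. That is one practical reason a completeness filter on the draft genomes matters.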


Statistics

We conducted all statistical analyses using R software. Spearman correlation was used for all correlation coefficient analyses, while statistical differences between population groups were determined using the Wilcoxon rank-sum test. Where multiple measurements (e.g., timepoint, skin site within an individual) were used for correlation analyses, partial Spearman correlation was used, adjusting for multiple measurements. To assess microbial stability over time, we used the Yue-Clayton theta index, which calculates the distance between communities based on relative abundance of species in the population [39]. Community diversity was determined using the Shannon diversity index, which measures both species richness and evenness.

DECLARATIONS

Ethics approval and consent to participate. Not applicable.

Consent for publication. All authors have approved submission of this manuscript.

Availability of data and material. The data used in this analysis are available in SRA under Bioproject 46333.

Competing interests. All authors declare that they have no competing interests.

Funding. This work was funded by the National Institutes of Health (DP2 GM126893-01 and K22 AI119231-01). JO is additionally supported by the National Institutes of Health (1U54NS105539, 1 U19 AI142733, 1 R21 AR075174, 1 R43 AR073562), the Department of Defense (W81XWH1810229), the National Science Foundation (1853071), the American Cancer Society, and Leo Foundation.

Authors' contributions. AE and JO conceived the project. WZ contributed scripts and analysis. AE and JO analyzed data. AE and JO wrote the manuscript.

Acknowledgements. We would like to thank the Oh lab for critical reading of the manuscript.

REFERENCES

1. Nielsen, H. B. et al. Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes. Nat Biotechnol. 32, 822-828 (2014).
2. Cleary, B. et al. Detection of low-abundance bacterial strains in metagenomic datasets by eigengenome partitioning. Nat Biotechnol. 33, 1053-1060 (2015).
3. Luo, C. et al. ConStrains identifies microbial strains in metagenomic datasets. Nat Biotechnol. 33, 1045-1052 (2015).
4. Tsai, Y. C. et al. Resolving the Complexity of Human Skin Metagenomes Using Single-Molecule Sequencing. MBio. 7, e01948-01915 (2016).
5. Oh, J. et al. Biogeography and individuality shape function in the human skin metagenome. Nature. 514, 59-64 (2014).
6. Oh, J. et al. Temporal Stability of the Human Skin Microbiome. Cell. 165, 854-866 (2016).
7. Rinke, C. et al. Insights into the phylogeny and coding potential of microbial dark matter. Nature. 499, 431-437 (2013).
8. Hong, C. et al. PathoScope 2.0: a complete computational framework for strain identification in environmental or clinical sequencing samples. Microbiome. 2, 33 (2014).
9. Segata, N. et al. Metagenomic microbial community profiling using unique clade-specific marker genes. Nat Methods. 9, 811-814 (2012).
10. Tettelin, H. et al. Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: implications for the microbial "pan-genome". Proc Natl Acad Sci U S A. 102, 13950-13955 (2005).
11. Li, D., Liu, C. M., Luo, R., Sadakane, K., Lam, T. W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics. 31, 1674-1676 (2015).
12. Korem, T. et al. Growth dynamics of gut microbiota in health and disease inferred from single metagenomic samples. Science. 349, 1101-1106 (2015).
13. Emiola, A., Oh, J. High throughput in situ metagenomic measurement of bacterial replication at ultra-low sequencing coverage. Nature Communications. 9, 4956 (2018).
14. Wang, J. D., Levin, P. A. Metabolism, cell growth and the bacterial cell cycle. Nat Rev Microbiol. 7, 822-827 (2009).
15. Grice, E. A., Segre, J. A. The skin microbiome. Nat Rev Microbiol. 9, 244-253 (2011).
16. Sharon, G. et al. Specialized metabolites from the microbiome in health and disease. Cell Metab. 20, 719-730 (2014).
17. Donia, M. S., Fischbach, M. A. HUMAN MICROBIOTA. Small molecules from the human microbiota. Science. 349, 1254766 (2015).
18. Zhou, W., Gay, N., Oh, J. ReprDB and panDB: minimalist databases with maximal microbial representation. Microbiome. 6(1), 15 (2018).
19. Brito, I. L. et al. Mobile genes in the human microbiome are structured from global to individual scales. Nature. 535, 435-439 (2016).
20. Zeeuwen, P. L., Kleerebezem, M., Timmerman, H. M., Schalkwijk, J. Microbiome and skin diseases. Curr Opin Allergy Clin Immunol. 13, 514-520 (2013).
21. Oh, J. et al. The altered landscape of the human skin microbiome in patients with primary immunodeficiencies. Genome Res. 23, 2103-2114 (2013).
22. Langmead, B., Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat Methods. 9, 357-359 (2012).
23. Bankevich, A. et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol. 19, 455-477 (2012).
24. Kang, D. D., Froula, J., Egan, R., Wang, Z. MetaBAT, an efficient tool for accurately reconstructing single genomes from complex microbial communities. PeerJ. 3, e1165 (2015).
25. Huson, D. H., Auch, A. F., Qi, J., Schuster, S. C. MEGAN analysis of metagenomic data. Genome Res. 17, 377-386 (2007).
26. Parks, D. H., Imelfort, M., Skennerton, C. T., Hugenholtz, P., Tyson, G. W. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 25, 1043-1055 (2015).
27. Seemann, T. Prokka: rapid prokaryotic genome annotation. Bioinformatics. 30, 2068-2069 (2014).
28. Liao, Y., Smyth, G. K., Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics. 30, 923-930 (2013).
29. Love, M. I., Huber, W., Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
30. Katoh, K., Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 30, 772-780 (2013).
31. Wood, D. E., Salzberg, S. L. Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome Biol. 15(3), R46 (2014).
32. Hyatt, D. et al. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 11, 119 (2010).
33. Simão, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V., Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 31, 3210-3212 (2015).
34. Edgar, R. C. Search and clustering orders of magnitude faster than BLAST. Bioinformatics. 26, 2460-2461 (2010).
35. Charif, D., Lobry, J. R. in Structural Approaches to Sequence Evolution: Molecules, Networks, Populations (Bastolla, U., Porto, M., Roman, H. E., Vendruscolo, M. eds.) (Springer, 2007).
36. Yang, Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 24, 1586-1591 (2007).
37. Luo, H., Lin, Y., Gao, F., Zhang, C. T., Zhang, R. DEG 10, an update of the database of essential genes that includes both protein-coding genes and noncoding genomic elements. Nucleic Acids Res. 42(D1), D574-D580 (2013).
38. Song, W., Steensen, K., Thomas, T. HgtSIM: a simulator for horizontal gene transfer (HGT) in microbial communities. PeerJ. 5, e4015 (2017).
39. Yue, J. C., Clayton, M. K. A similarity measure based on species proportions. Commun. Stat. A-Theor. 34, 2123-2131 (2005).

FIGURE LEGENDS

Figure 1. Hybrid de novo and reference-based approach resolves dark matter in skin metagenome.
(A) Simplified flowchart of the hybrid de novo and reference-based characterization. (B) Boxplots show the fraction of uncharacterized sequences when using the reference database from Oh et al. [6], de novo genome bins only, or the hybrid approach. Black lines indicate median; boxes indicate first and third quartiles. (C) Microbial relative abundance across skin sites from a representative individual using the different classification approaches. (D) Heatmap shows microbial relative abundance across all samples from hybrid de novo and reference-based characterization, segregated by skin site characteristic. Darker colors indicate higher relative abundance. (E) Community diversity using Shannon diversity index. * = p-value < 0.05, ** = p-value < 0.01, *** = p-value < 0.001, NS = not significant by Wilcoxon rank-sum test. (F) Estimation of community stability using Yue-Clayton theta index, where θ ~ 1 represents a completely stable community. “Long” refers to the sampling time interval between T1 and T2, while “Short” refers to the sampling time interval between T2 and T3. * = p-value < 0.05, ** = p-value < 0.01, *** = p-value < 0.001, NS = not significant by Wilcoxon rank-sum test.
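For readers unfamiliar with the two community indices reported in panels E and F, the sketch below computes them from species abundance tables. It uses the standard definitions, Shannon H = -Σ p_i ln p_i and the Yue-Clayton similarity θ = Σ p_i q_i / (Σ p_i² + Σ q_i² - Σ p_i q_i), rather than the authors' R code. The natural-log base and all function names are assumptions, since the manuscript does not state them.

```python
import math
from typing import Dict


def _normalize(counts: Dict[str, float]) -> Dict[str, float]:
    """Convert abundances to proportions, dropping zero entries."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("community has no counts")
    return {k: v / total for k, v in counts.items() if v > 0}


def shannon_index(counts: Dict[str, float]) -> float:
    """Shannon diversity using natural logarithms (base is an assumption)."""
    p = _normalize(counts)
    return -sum(x * math.log(x) for x in p.values())


def yue_clayton_theta(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Yue-Clayton theta similarity between two communities.

    Equals 1 for identical relative abundance profiles and 0 when the
    two communities share no species.
    """
    p, q = _normalize(a), _normalize(b)
    shared = sum(p[k] * q[k] for k in p.keys() & q.keys())
    sum_p2 = sum(x * x for x in p.values())
    sum_q2 = sum(x * x for x in q.values())
    return shared / (sum_p2 + sum_q2 - shared)
```

Comparing the same site in one individual at two timepoints with `yue_clayton_theta` gives a value near 1 when the community is stable, which matches how θ is read in panel F.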


Figure 2. Growth dynamics of skin bacteria.
Growth rate was calculated using GRiD [13]. (A) Proportion of common skin microbes in exponential vs. stationary growth phase/absent across different skin sites. GRiD = 1 for a microbe was considered stationary phase. C. acnes = Cutibacterium acnes, C. = Corynebacterium, S. = Staphylococcus. (B) Boxplot showing the GRiD score of microbes over time and grouped by skin site characteristics. (C) The heatmap shows partial Spearman correlation coefficient values, correcting for multiple measurements, between growth rate (GRiD) and relative abundance. Black colors indicate no significant correlation. The scatter plots below show the correlation between C. acnes growth rate (GRiD) and relative abundance from representative skin sites. (D) Boxplot showing microbial growth rate (GRiD) in healthy and primary immunodeficiency cohorts. Significant differences between groups were computed with the Wilcoxon rank-sum test.

Figure 3. Structural variants in Staphylococcus epidermidis.
(A) Venn diagram showing the number of fragments differentially enriched in sebaceous sites when compared with dry, moist, or foot sites. (B) Phylogenetic tree constructed for genes identified in candidate fragments of S. epidermidis and S. capitis. Genes from both species clustering together are highlighted. (C) Candidate S. epidermidis fragment, corresponding gene, and functional annotation. (D) Spearman correlation between candidate S. epidermidis gene fragments and relative abundance or growth rate (GRiD) in sebaceous and foot sites.

Figure 4. Horizontal transfer of genes in skin community.
(A) Overview of the HGT candidate identification process. Metagenomic reads were assembled and contigs belonging to the same species were pooled into a species draft genome. For each pair of species draft genomes, orthologous gene pairs were predicted. If a gene pair had a significantly smaller synonymous distance than the immobile gene pairs (that is, the universal single-copy orthologs), the pair of genes was identified as an HGT candidate. (B) Distribution of functions (i.e., KEGG BRITE annotations) of all identified horizontal gene transfer (HGT) candidates and candidates that were annotated as transporters. (C) Detailed functions (i.e., KEGG orthologs) of HGT candidates annotated as metallic cation, iron-siderophore, and vitamin B12 transporters identified in the sebaceous sites. (D) Networks representing the top 10 species pairs for which HGT events were most frequent (i.e., HGT events were identified in the largest number of samples) in each type of skin site. Nodes represent species and edges represent HGT events. In each type of skin site, the size of a node is proportional to the degree of that node.

SUPPLEMENTARY INFORMATION

Supplementary Figure 1. Skin sites, skin physiologic characteristics, and number of samples used. Overview of 698 samples analyzed in this study, encompassing 15 healthy adults and two hyper-IgE patients, three timepoints, and 17 skin sites, representing 4 microenvironments: dry, moist, sebaceous, and foot sites. The numbers adjacent to each site correspond to the total number of samples derived from those sites.

Supplementary Figure 2. Flowchart of hybrid de novo and reference-based approach to resolve microbial dark matter. The inset boxplot represents the percentage of reads mapping back to the contigs/scaffolds (> 1 Kb) catalog after each round of iterative assembly. Black lines indicate median; boxes indicate first and third quartiles.

Supplementary Figure 3. Hybrid de novo and reference-based resolution of dark matter.
(A) Scatterplots show concordance in relative abundance between the original classification in Oh et al. [6] and the hybrid approach. Points deviating significantly from the diagonal are those whose relative abundance changed significantly based on the resolution of dark matter. Major genera, phyla, and kingdoms are shown. (B) Annotation of uncharacterized bins. Boxplots show the proportion of uncharacterized bins reassigned to a taxon. We relaxed our initial annotation requirement by excluding contigs/scaffolds with no hits from MEGAN output and re-ran our annotation pipeline.

Supplementary Figure 4. Higher compositional reconstruction of the skin microbiome.
(A) Community stability across skin sites as calculated by the Yue-Clayton theta index, where θ ~ 1 represents a completely stable community. “Long” refers to the sampling time interval between T1 and T2, while “Short” refers to the sampling time interval between T2 and T3. (B) Heatmap shows partial Spearman correlation values, correcting for multiple measurements, between microbial relative abundance at timepoints T2 vs T3 (i.e., short time interval) (top) and T1 vs T2 (i.e., long time interval) (bottom). Black colors indicate no correlation. C. acnes = Cutibacterium acnes, C. = Corynebacterium, S. = Staphylococcus.

Supplementary Figure 5. HGT candidate identification pipeline.

Supplementary Figure 6. Validation of the HGT prediction pipeline.
(A) Simulated HGT genes that were identified using the pipeline between all species pairs in all simulated communities (0%, 5%, and 10% mutations). (B) Number of HGT genes identified as a function of the synonymous distance of the HGT genes in the species pairs. Identified HGT genes that matched the simulated HGT genes are shown in red.

Supplementary Table 1: Microbial genome bins identified using de novo approach. Bins highlighted in yellow represent non-microbial genomes.

Supplementary Table 2: Bin coverage across samples.

Supplementary Table 3: Community relative abundance resolved using hybrid de novo and reference-based approach.


[Figure 1, panels A-F (image not reproduced). Recoverable labels: workflow boxes "De novo approach", "Reference-based approach", "Abundance, diversity, stability, co-occurrence", "Growth rate prediction", "Structural variant analysis", "Horizontal gene transfer prediction"; axes "Percentage of reads", "Relative abundance", "Shannon diversity index", "Theta index"; series "Reference v1", "Binning only", "Binning and reference"; timepoints T1-T3; comparisons "Long", "Short", "Interpersonal"; skin sites Hp, Vf, Ac, Ic, Id, Pc, Al, Ba, Ch, Ea, Gb, Mb, Oc, Ra, Ph, Tn, Tw, grouped as Dry, Moist, Sebaceous, Foot.]

bioRxiv preprint doi: https://doi.org/10.1101/2020.01.21.914820; this version posted January 23, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Page 17: An enhanced characterization of the human skin microbiome ... · 1/21/2020  · 1 1 An enhanced characterization of the human skin microbiome: a new biodiversity of 2 microbial interactions

[Figure, panels A-D. Labels recoverable from the extracted text:
Growth status (Mostly growing versus Absent/mostly stationary) of S. epidermidis, C. jeikeium, C. ureicelerivorans, S. aureus, and C. acnes across the 17 skin sites (Hp to Tw), grouped as Dry, Moist, Sebaceous, and Foot.
GRiD values (1.0-3.0) at time points T1, T2, and T3 in Dry, Moist, Sebaceous, and Foot sites.
Spearman correlation (-0.5 to 0.5) between relative abundance and GRiD for the same five species by site type, with example scatter plots of Relative Abundance against GRiD for sites Vf, Id, Al, and Tn.
GRiD (1.0-2.5) in healthy versus hyper_IgE samples for the five species, with significance labels NS, NS, NS, P = 0.013, and P = 0.002.]

Page 18: An enhanced characterization of the human skin microbiome ... · 1/21/2020  · 1 1 An enhanced characterization of the human skin microbiome: a new biodiversity of 2 microbial interactions

[Figure, panels A-D. Labels recoverable from the extracted text:
Counts for S. epidermidis and S. capitis enriched in sebaceous sites, compared v. dry, v. moist, and v. foot.
Dendrogram (axis: Height, 0-3) with leaves S_capitis_03333, S_capitis_03332, S_capitis_03322, S_epi_19640, S_capitis_03029, S_epi_20627, S_epi_19594, S_capitis_02938, S_capitis_03512, S_capitis_03299, S_capitis_00325, S_epi_19593, S_epi_14090, S_capitis_03320, S_capitis_02939, S_epi_13619, S_capitis_03300, S_capitis_03321, S_capitis_03325, S_epi_13620. Legend: Plasmid-borne; Phages.
Spearman correlation of fragments S_epi.fa_18242001-18243000, S_epi.fa_17080001-17081000, S_epi.fa_12070001-12071000, S_epi.fa_17193001-17194000, S_epi.fa_17226001-17227000, and S_epi.fa_11535001-11536000 with relative abundance and with growth rate, shown for the Foot site and the Sebaceous site.

Fragment                      Gene          Functional annotation
S_epi.fa_11535001-11536000    S_epi_13619   CsbD stress response protein
S_epi.fa_11535001-11536000    S_epi_13620   hypothetical protein
S_epi.fa_12070001-12071000    S_epi_14090   hypothetical protein
S_epi.fa_17193001-17194000    S_epi_19593   hypothetical protein
S_epi.fa_17193001-17194000    S_epi_19594   transcriptional regulator
S_epi.fa_17226001-17227000    S_epi_19640   hypothetical protein
S_epi.fa_18242001-18243000    S_epi_20627   hypothetical protein]

Page 19: An enhanced characterization of the human skin microbiome ... · 1/21/2020  · 1 1 An enhanced characterization of the human skin microbiome: a new biodiversity of 2 microbial interactions

[Figure, panels A-D. Labels recoverable from the extracted text:
Schematic of horizontal gene transfer (HGT) prediction: metagenomic reads are assembled into a Species 1 draft genome and a Species 2 draft genome; immobile gene pairs (example distances 0.47, 0.48, 0.52) are compared with test gene pairs, and a predicted HGT gene has a much smaller distance (Distance = 0.08); the remaining axis is Probability.
Species networks for Dry, Sebaceous, Moist, and Toenail sites. Species shown: C. acnes, G. bronchialis, L. clevelandensis, P. fluorescens, M. aurum, C. simulans, C. ureicelerivorans, C. granulosum, M. luteus, A. oris, C. aurimucosum, C. jeikeium, P. alcaligenes, G. vaginalis, S. maltophilia, S. epidermidis, C. kroppenstedtii, S. capitis, C. singulare, C. frankenforstense, P. acnes, K. rhizophila, S. simulans, A. prevotii, S. warneri, A. mediterraneensis, S. camporealensis, S. pettenkoferi.
Relative abundances (0-1) of functional categories in Dry, Sebaceous, Moist, and Toenail sites, comparing Predicted HGT candidates with genes Not identified in the HGT gene pool.
Functional categories: Photosynthesis proteins; Cytochrome P450; Glycosaminoglycan binding protein; Protein kinases; Peptidases; Glycosyltransferases; Lipid biosynthesis proteins; Lipopolysaccharide biosynthesis proteins; Prenyltransferases; Amino acid-related enzymes; Polyketide biosynthesis proteins; Phosphatases and associated proteins; Transporters; Two-component system; Bacterial motility proteins; Bacterial toxins; Secretion system proteins; Prokaryotic defense system; Transcription factors; Translation factors; Transfer RNA biogenesis; Messenger RNA biogenesis; Transcription machinery; Mitochondrial biogenesis; DNA replication proteins; Chromosome and associated proteins; Proteasome; Chaperones and folding catalysts; DNA repair and recombination proteins; CD molecules; Ubiquitin system; Exosome; Cell adhesion molecules and ligands; Cytoskeleton proteins.
Transporter categories: ABC-2 type and other transporters; Accessory factors in transport; Aquaporins and neutral solute transporters; Drug transporters; Electrochemical potential-driven transporters; Enzyme I and HPr in Phosphotransferase system; Enzyme II in Phosphotransferase system; Metallic cation, iron-siderophore, and VB12 transporters; Metal transporters; Mineral and organic ion transporters; Nitrate/nitrite transporters; Organic acid transporters; Peptide and nickel transporters; Phosphate and amino acid transporters; Phosphate and organophosphate transporters; Pores ion channels; Protein transporters; Saccharide, polyol, and lipid transporters; Sodium bile salt cotransporter; Type II Na+-phosphate cotransporter; Multidrug and toxin extrusion family; Sugar transporters; Transmembrane electron carriers; Unknown transporters.]

Page 20: An enhanced characterization of the human skin microbiome ... · 1/21/2020  · 1 1 An enhanced characterization of the human skin microbiome: a new biodiversity of 2 microbial interactions

[Figure: body map (Front and Back views) of the sampled skin sites, color-coded as Sebaceous, Moist, Dry, or Foot.
Sites: Retroauricular crease (Ra); Occiput (Oc); Back (Ba); Glabella (Gb); External auditory canal (Ea); Manubrium (Mb); Antecubital fossa (Ac); Volar forearm (Vf); Hypothenar palm (Hp); Inguinal crease (Ic); Toe web space (Tw); Toenail (Tn); Plantar heel (Ph); Popliteal fossa (Pc); Interdigital web (Id); Alar crease (Al); Cheek (Ch).
Each site is annotated with its sample count (between 35 and 44 per site).
Additional samples: 13 Nares, 2 Axilla, 6 Control mock community.]

Page 21: An enhanced characterization of the human skin microbiome ... · 1/21/2020  · 1 1 An enhanced characterization of the human skin microbiome: a new biodiversity of 2 microbial interactions

[Figure: annotation workflow. Labeled steps recoverable from the extracted text:
Sample pool (n = 698): concatenate all samples.
De novo assembly (MEGAHIT) to obtain contigs; extract contigs > 1 Kb to form the contig pool, Contigs (R1).
Bowtie mapping against the contig pool; extract unmapped reads.
Concatenate a subset of unmapped reads from selected samples; assemble the subset (de novo assembly with SPAdes), extract contigs/scaffolds > 1 Kb, and concatenate with the previous contig pool (i.e. R1) to form Contigs/scaffolds (Rn).
Repeat step until n = 5.
Binning (MetaBAT) to obtain genome bins; unbinned reads go to Pathoscope, giving Pathoscope-assigned reads.
Inset plot: percentage of reads mapping (0-100) for rounds R1, R2, R3, R4, and R5, labeled by assembler (MEGAHIT, SPAdes).]

dark matter reads = total number of reads - (number of binned reads + number of Pathoscope-assigned reads)
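The definition above is plain arithmetic, and a minimal sketch may make it concrete. The function names and the read counts below are hypothetical and chosen only for illustration; they are not taken from the preprint or from any tool it uses.

```python
def dark_matter_reads(total_reads: int, binned_reads: int, pathoscope_reads: int) -> int:
    """Reads left uncharacterized after binning and Pathoscope assignment.

    Implements: total - (binned + Pathoscope-assigned).
    All three arguments are read counts for the same sample.
    """
    if min(total_reads, binned_reads, pathoscope_reads) < 0:
        raise ValueError("read counts must be non-negative")
    characterized = binned_reads + pathoscope_reads
    if characterized > total_reads:
        raise ValueError("characterized reads exceed total reads")
    return total_reads - characterized


def dark_matter_fraction(total_reads: int, binned_reads: int, pathoscope_reads: int) -> float:
    """Dark matter reads as a fraction of all reads (0.0 for an empty sample)."""
    if total_reads == 0:
        return 0.0
    return dark_matter_reads(total_reads, binned_reads, pathoscope_reads) / total_reads


if __name__ == "__main__":
    # Hypothetical counts for one sample, not data from the study.
    total, binned, assigned = 1_000_000, 550_000, 250_000
    print(dark_matter_reads(total, binned, assigned))
    print(dark_matter_fraction(total, binned, assigned))
```

The sketch assumes that binned and Pathoscope-assigned reads do not overlap, which matches the workflow labels, where only unbinned reads are passed to Pathoscope.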

Page 22: An enhanced characterization of the human skin microbiome ... · 1/21/2020  · 1 1 An enhanced characterization of the human skin microbiome: a new biodiversity of 2 microbial interactions

[Figure, panels A and B. Labels recoverable from the extracted text:
Scatter plots of Relative abundance (Binning and reference) against Relative abundance (Reference v1) in Dry, Moist, Sebaceous, and Foot sites, one row per taxon: Eukaryota, Firmicutes, Proteobacteria, Actinobacteria, C. acnes, Corynebacterium, Staphylococcus, S. epidermidis, Viruses.
Relative abundance (0-1) of Proteobacteria, Bacteroidetes, Firmicutes, Actinobacteria, Other Bacteria, and Eukaryota in Dry, Foot, Moist, and Sebaceous sites.]

Page 23: An enhanced characterization of the human skin microbiome ... · 1/21/2020  · 1 1 An enhanced characterization of the human skin microbiome: a new biodiversity of 2 microbial interactions

[Figure, panels A and B. Labels recoverable from the extracted text:
Theta index (0-1) for Long, Short, and Interpersonal comparisons, Reference v1 versus Binning and reference.
Two heatmaps of Spearman correlation coefficient (color scale 0-1) with skin sites as rows (Tw, Tn, Ph, Ra, Oc, Mb, Gb, Ea, Ch, Ba, Al, Pc, Id, Ic, Ac, Vf, Hp; grouped as Dry, Moist, Sebaceous, and Foot) and taxa as columns (Corynebacterium, C. jeikeium, C. ureicelerivorans, Malasseziaceae, Propionibacterium, C. acnes, Pseudomonas, Staphylococcus, S. aureus, S. epidermidis, Streptococcus). Nonzero cell values range from 0.61 to 0.98; the remaining cells are 0.]

Page 24: An enhanced characterization of the human skin microbiome ... · 1/21/2020  · 1 1 An enhanced characterization of the human skin microbiome: a new biodiversity of 2 microbial interactions

Page 25: An enhanced characterization of the human skin microbiome ... · 1/21/2020  · 1 1 An enhanced characterization of the human skin microbiome: a new biodiversity of 2 microbial interactions

[Figure, panels A and B. Labels recoverable from the extracted text:
Counts of HGT against synonymous distance for genes horizontally transferred from C. acnes, S. epidermidis, and S. mitis, comparing All predicted HGTs with Predicted HGTs that match simulated HGTs.
Outcome of simulated HGT genes for the species pairs C. acnes - S. epidermidis, C. acnes - S. mitis, and S. epidermidis - S. mitis at 0% mutation, 5% mutation, and 10% mutation. Legend: successfully identified HGT gene; HGT gene not recognized by Prodigal; HGT gene without fully assembled sequence.]
