Binning - KNAW · 2014-09-23 · Determining what belongs together by crosslinking total cell...

Date post: 12-Feb-2020
From metagenomic contigs to draft genomes Binning Daan Speth [email protected] @daanspeth
Transcript
Page 1: Binning - KNAW · 2014-09-23 · Determining what belongs together by crosslinking total cell content 1) Crosslink 2) Cut DNA 3) Religate randomly 4) Sequence paired end labrary of

From metagenomic contigs to draft genomes

Binning

Daan Speth [email protected]

@daanspeth

The problem

Binning: clustering sequences with the same origin together

A corner piece? GREAT! But where is the rest of the puzzle?

Drew Sheneman, New Jersey -- The Newark Star Ledger

Potato processing wastewater treatment plant at Olburgen, The Netherlands

Stable system operated since 2006

Images (left and middle): Abma et al., Water Science & Technology (2010)

Study site

Sampling strategy: 8 samples

[Figure: nitritation/anammox reactor (600 m³) with height markers at 0.2 m, 1.4 m, 2.6 m, 3.8 m and 5.0 m. Samples 1 to 8 are laid out by sample location, sample treatment (total sample or washed granules) and DNA isolation method (organic extraction or Powersoil kit).]

Data handles

Sequence composition

Prior knowledge (Databases)

Sequence abundance

Mate pair & Paired end

Data handles: mate pair and paired end

Data handles: databases

Data handles: composition

Limited chemical signature

Biological information
- Codon usage (tetramer frequency)

‘Unique’ long k-mers

Contig/read length matters!
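The composition handles above can be sketched in a few lines of Python. This is an illustration only: the function names and the example sequence are made up for this sketch, and published binning tools may merge reverse-complement tetramers or normalise differently.

```python
from collections import Counter
from itertools import product


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases of seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def tetramer_frequencies(seq: str) -> dict[str, float]:
    """Relative frequency of each of the 256 tetramers in seq.

    Windows containing anything other than A, C, G or T are skipped.
    """
    seq = seq.upper()
    all_tetramers = ["".join(p) for p in product("ACGT", repeat=4)]
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[t] for t in all_tetramers)
    return {t: (counts[t] / total if total else 0.0) for t in all_tetramers}


# Made-up sequence, far too short for real use.
contig = "ATGCGCGATATCGCGCATATGCGC"
print(round(gc_content(contig), 3))
print(len(tetramer_frequencies(contig)))
```

A sequence of length L contributes only L - 3 tetramer windows spread over 256 possible tetramers, which is one way to see why contig/read length matters: short sequences give a sparse, noisy signature.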

DNA isolation and library preparation

Sequencing and assembly

Data handles: abundance

Abundance in the sample correlates with abundance in reads
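As a rough sketch of how abundance is read off the data, the mean sequencing depth of a contig can be estimated as mapped bases divided by contig length. The function name and the numbers below are assumptions chosen for illustration; a real pipeline would take them from a read mapper's output.

```python
def mean_coverage(n_reads: int, read_length: int, contig_length: int) -> float:
    """Average per-base sequencing depth of a contig."""
    if contig_length <= 0:
        raise ValueError("contig_length must be positive")
    return n_reads * read_length / contig_length


# Hypothetical numbers: 2,000 reads of 100 bp mapped to a 10 kb contig.
print(mean_coverage(2000, 100, 10_000))  # 20.0
```

Contigs from the same genome are expected to show similar depth, because they are sampled in proportion to the same organism's abundance.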

Many roads try to get to Rome

Reference based and reference independent binning methods

Mande, S. S., Mohammed, M. H. & Ghosh, T. S. Classification of metagenomic sequences: methods and challenges. Briefings in Bioinformatics 13, 669–681 (2012).

Composition
- GC content
- Tetranucleotide frequencies

Abundance
- Long k-mer copy number
- Contig coverage

Content
- Essential single copy genes

Binning approaches

Assembly independent read binning

Binning on GC content and coverage

Tetranucleotide ESOM

Differential coverage based binning
- Nucleotide extraction bias
- Different samples

Hi-C Metagenomics

Assembly independent binning

Wang, Y., Leung, H. C. M., Yiu, S. M. & Chin, F. Y. L. MetaCluster 5.0: a two-round binning approach for metagenomic data for low-abundance species in a noisy sample. Bioinformatics 28, i356–i362 (2012).

T = long k-mer abundance

w = long k-mer length

Binning approaches

Assembly independent read binning

Binning on GC content and sequencing depth

Tetranucleotide ESOM

Differential coverage based binning
- Nucleotide extraction bias
- Different samples

Hi-C Metagenomics

Separating genomes: binning

Binning based on coverage and GC content
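A rectangle in the GC/depth plane is about the simplest way to express this kind of bin. The sketch below assumes GC content and mean coverage have already been computed per contig; the `Contig` type, the function name, the thresholds and the data are all hypothetical.

```python
from typing import NamedTuple


class Contig(NamedTuple):
    name: str
    gc: float        # GC fraction, 0 to 1
    coverage: float  # mean sequencing depth


def select_bin(contigs, gc_range, coverage_range):
    """Return the contigs that fall inside a GC/coverage rectangle."""
    gc_lo, gc_hi = gc_range
    cov_lo, cov_hi = coverage_range
    return [
        c for c in contigs
        if gc_lo <= c.gc <= gc_hi and cov_lo <= c.coverage <= cov_hi
    ]


# Made-up contigs: two look like one genome, one clearly does not.
contigs = [
    Contig("contig_1", 0.41, 95.0),
    Contig("contig_2", 0.43, 110.0),
    Contig("contig_3", 0.65, 12.0),
]
bin_a = select_bin(contigs, gc_range=(0.38, 0.46), coverage_range=(80, 130))
print([c.name for c in bin_a])  # ['contig_1', 'contig_2']
```

In practice the window is usually drawn by eye on a scatter plot, so the limits here stand in for a manual selection.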

[Plot: sequencing depth (vertical axis) against GC content (horizontal axis)]

Binning approaches

(This is not an exhaustive list…)

Assembly independent read binning

Binning on GC content and coverage

Tetranucleotide ESOM

Differential coverage based binning
- Nucleotide extraction bias
- Different samples

Hi-C Metagenomics

Binning: tetranucleotide ESOM

Dick, G. J., Andersson, A. F., Baker, B. J. & Simmons, S. L. Community-wide analysis of microbial genome sequence signatures. Genome Biology (2009).

Emergent Self Organizing Map (ESOM) based on tetranucleotide frequency

Using nucleotide extraction bias to separate organisms

Albertsen, M. et al. Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of multiple metagenomes. Nat Biotechnol 31, 533–538 (2013).

Binning: differential coverage binning

http://madsalbertsen.github.io/multi-metagenome/
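The idea behind differential coverage can be sketched as follows: if the same community is sequenced twice (for example after two DNA extraction methods), contigs from one genome should sit close together in the plane of coverage in sample A against coverage in sample B. The code below is a minimal illustration under that assumption, not the workflow of the cited paper or the linked scripts; the function name, tolerance and coverage values are invented.

```python
import math


def contigs_like_seed(coverages, seed, tolerance=0.5):
    """Names of contigs whose log2 coverage in both samples lies within
    `tolerance` of the seed contig's log2 coverage.

    coverages maps contig name -> (coverage in sample A, coverage in sample B).
    All coverages must be greater than zero.
    """
    seed_a, seed_b = (math.log2(c) for c in coverages[seed])
    picked = []
    for name, (cov_a, cov_b) in coverages.items():
        if (abs(math.log2(cov_a) - seed_a) <= tolerance
                and abs(math.log2(cov_b) - seed_b) <= tolerance):
            picked.append(name)
    return picked


# Invented coverages in two samples of the same community.
coverages = {
    "c1": (100.0, 10.0),
    "c2": (90.0, 12.0),
    "c3": (95.0, 100.0),  # same depth as c1 in sample A, very different in B
    "c4": (8.0, 11.0),
}
print(contigs_like_seed(coverages, seed="c1"))  # ['c1', 'c2']
```

Note that `c3` would be indistinguishable from `c1` with sample A alone; only the second sample separates them.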

Differential coverage binning: crAss

Differential coverage binning: GroopM

http://minillinim.github.io/GroopM/

Imelfort, M., Parks, D., Woodcroft, B. J. & Dennis, P. GroopM: An automated tool for the recovery of population genomes from related metagenomes. (2014).

Differential coverage binning: CONCOCT

Alneberg, J. et al. CONCOCT: Clustering cONtigs on COverage and ComposiTion. (2013).

Differential coverage binning: ESOM

Kantor, R. S. et al. Small Genomes and Sparse Metabolisms of Sediment-Associated Bacteria from Four Candidate Phyla. mBio 4, e00708-13 (2013).

Differential coverage binning: ESOM

Nielsen, H. B. et al. Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes. Nat Biotechnol 32, 822–828 (2014).

Determining what belongs together by crosslinking total cell content

1) Crosslink
2) Cut DNA
3) Religate randomly
4) Sequence paired-end library of both crosslinked and native sample

Binning: Hi-C metagenomics

Beitel, C. W. et al. Strain- and plasmid-level deconvolution of a synthetic metagenome by sequencing proximity ligation products. (2014). doi:10.7287/peerj.preprints.260v1

Clustering by organism (and even replicon!)
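Because crosslinking happens inside intact cells, read pairs that join two contigs suggest those contigs shared a cell. One simple way to turn that into clusters is to connect contigs that share enough Hi-C read pairs and take the connected components. This is a stand-in to illustrate the principle, not the clustering used in the cited preprint; the function name, the `min_links` threshold and the link counts are assumptions.

```python
def cluster_by_links(contigs, link_counts, min_links=5):
    """Group contigs connected by at least `min_links` Hi-C read pairs.

    link_counts maps (contig_x, contig_y) -> number of read pairs joining them.
    Returns a list of sets, one per connected component.
    """
    parent = {c: c for c in contigs}

    def find(x):
        # Follow parent pointers to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (x, y), n in link_counts.items():
        if n >= min_links:
            parent[find(x)] = find(y)

    groups = {}
    for c in contigs:
        groups.setdefault(find(c), set()).add(c)
    return list(groups.values())


# Invented link counts; the single c3-c4 pair is treated as noise.
links = {("c1", "c2"): 40, ("c2", "c3"): 25, ("c3", "c4"): 1, ("c4", "c5"): 30}
clusters = cluster_by_links(["c1", "c2", "c3", "c4", "c5"], links)
print(sorted(sorted(g) for g in clusters))  # [['c1', 'c2', 'c3'], ['c4', 'c5']]
```

A fixed threshold is crude: spurious ligation between cells produces some background links, so real data would likely need normalisation for contig length and abundance before clustering.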

Roads less travelled…

Whichever method you choose, do a background check…

When analyzing a complex community, experimental design largely determines how much you can get out

Binning: concluding remarks

