
ChIP-Seq data and analysis · Analysis of ChIP-seq data Differential binding analysis...

Date post: 07-Aug-2020
ChIP-Seq data and analysis Bori Mifsud Postdoc in Luscombe group 18.09.2014 Computational biology UCL-LRI

ChIP-Seq data and analysis

Bori Mifsud

Postdoc in Luscombe group

18.09.2014

Computational biology UCL-LRI


Why do Chromatin Immunoprecipitation (ChIP)?

~99.9% identical genetic material


100% identical genetic material


DNA → (transcription) → RNA → (translation) → Proteins


ChIP to understand transcriptional regulation!

Map regulatory elements:

• Transcription factors – ChIP
• Histone marks – ChIP
• DNA methylation – MeDIP etc.
• Nucleosomes
• RNA polymerase – Pol II ChIP


ChIP-seq protocol


Analysis of ChIP-seq data

• Experimental design – Controls and replicates
• QC/Read processing – Library QC – Alignment and filtering – QC measures and assessment
• Peak calling – Peak callers
• Differential binding analysis – Occupancy-based analysis – Affinity-based analysis
• Validation and downstream analysis – Motif analysis – Annotation – Integrating binding and expression data


ENCODE project

Landt et al. (2012) ChIP-seq guidelines and practices of the ENCODE and modENCODE Consortia. Genome Research 22: 1813-1831

Chen et al. (2012) Systematic evaluation of factors influencing ChIP-seq fidelity. Nat Methods 9: 609


Consideration 2: Why do you need controls?


• Non-uniform fragmentation (euchromatin-heterochromatin)

• GC sequencing bias


[Chen et al, 2012]
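A minimal sketch of why an input control helps with biases like these: a window with many ChIP reads is only interesting if the input does not show the same excess. The function name, the pseudocount and the toy counts are all assumptions made for illustration, not values from the slides.

```python
def fold_enrichment(chip_counts, input_counts, pseudocount=1.0):
    """Per-window ChIP/input ratio after scaling the input to ChIP depth.

    The pseudocount (an assumed value) avoids division by zero in empty windows.
    """
    scale = sum(chip_counts) / sum(input_counts)
    return [
        (chip + pseudocount) / (inp * scale + pseudocount)
        for chip, inp in zip(chip_counts, input_counts)
    ]


# Toy read counts in five windows (hypothetical numbers).
# Window 3 has many ChIP reads, but the input is high there too
# (for example open chromatin or a GC-rich region).
chip_counts = [5, 5, 50, 20, 20]
input_counts = [20, 20, 20, 80, 60]

enrichment = fold_enrichment(chip_counts, input_counts)
for window, value in enumerate(enrichment):
    print(window, round(value, 2))
```

In this toy data only window 2 stands out after normalisation; the raw ChIP signal in windows 3 and 4 is explained by the input.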


The more sequencing depth you have for the input the better you can identify peaks!

[Chen et al, 2012]


(over 100 million reads – HiSeq)


Proteins bind in different ways

• Transcription factor – tight, highly peaked binding region
• RNA Pol II – enriched at TSS but bound throughout the gene body

ChIP-Seq data from fly S2 cells


Activating mark (near TSS)

Peaks within body of active genes

Peaks within body of inactive genes


Consideration 3: Sequencing depth (optimum is different for different peak finder software)

Supplementary Figure 10. The change in identified ChIP-enriched regions of (a) Su(Hw) and (b) H3K36me3 with respect to the regions that were identified using the complete data is shown with the increase of sequencing depth for different algorithms. Macs-f3 and Useq-f3 denote the Su(Hw) regions that have more than 3 fold enrichment and were identified by Macs and Useq, respectively.

Nature Methods: doi:10.1038/nmeth.1985

[Chen et al, 2012]

Plateau for most peak finders: ~16.2 M reads in Drosophila (corresponding to ~327 M reads in human)
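As a rough worked example, the two numbers above imply a scaling factor of about 20 between fly and human read depth. The sketch below only does that arithmetic. Treating the factor as a simple multiplier for other depths is an assumption, and the 10 M read example is hypothetical.

```python
def implied_ratio(reads_source, reads_target):
    """Scaling factor implied by two read depths quoted as equivalent."""
    return reads_target / reads_source


def scale_depth(reads, ratio):
    """Apply a scaling factor to a read depth (assumes linear scaling)."""
    return reads * ratio


# Numbers taken from the slide: ~16.2 M reads (Drosophila), ~327 M reads (human).
ratio = implied_ratio(16.2e6, 327e6)

# Hypothetical example: what a 10 M read fly experiment would correspond to.
human_equivalent = scale_depth(10e6, ratio)

print(f"implied ratio: {ratio:.1f}")
print(f"10 M fly reads ~ {human_equivalent / 1e6:.0f} M human reads")
```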


• There is a difference when you assess the complexity of the sample

Reproducibility information gives confidence in peaks and helps in choosing thresholds (IDR)
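A very simple way to look at reproducibility between two replicates is to ask what fraction of peaks from one replicate overlap a peak in the other. The sketch below does only that. It is not the IDR procedure, and the peak coordinates and function names are made up for illustration.

```python
def overlaps(peak_a, peak_b):
    """True if two (chrom, start, end) peaks share at least one base (half-open)."""
    chrom_a, start_a, end_a = peak_a
    chrom_b, start_b, end_b = peak_b
    return chrom_a == chrom_b and start_a < end_b and start_b < end_a


def fraction_reproduced(peaks_rep1, peaks_rep2):
    """Fraction of replicate 1 peaks that overlap any replicate 2 peak."""
    if not peaks_rep1:
        return 0.0
    hits = sum(any(overlaps(p, q) for q in peaks_rep2) for p in peaks_rep1)
    return hits / len(peaks_rep1)


# Hypothetical peak calls from two replicates.
rep1 = [("chr2L", 100, 200), ("chr2L", 500, 600), ("chr3R", 50, 120)]
rep2 = [("chr2L", 150, 260), ("chr3R", 400, 500)]

reproduced = fraction_reproduced(rep1, rep2)
print(f"{reproduced:.2f} of replicate 1 peaks are also found in replicate 2")
```

The quadratic loop is fine for a toy example; real peak sets would normally be compared with an interval-based tool.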


Data processing steps

Schematic of ChIP-seq experiments [Park et al, 2009]:

ChIP → sequencing → alignment → peak-finding


Quality control

• Read quality
• Sequence content
• Duplication (PCR artefacts)
• Library complexity (overrepresented sequences)
• Contamination

Many tools (SAMstat, htSeqTools, FastQC etc.)
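To make duplication and library complexity concrete, one common summary is the fraction of mapped reads that fall at distinct positions, which Landt et al. (2012) call the non-redundant fraction (NRF). The sketch below assumes reads are already reduced to (chromosome, 5' position, strand) tuples. The toy reads are hypothetical.

```python
def non_redundant_fraction(alignments):
    """Distinct (chrom, position, strand) locations divided by total mapped reads."""
    alignments = list(alignments)
    if not alignments:
        return 0.0
    return len(set(alignments)) / len(alignments)


# Hypothetical mapped reads: three identical copies (possible PCR duplicates)
# plus two reads at unique positions.
reads = [
    ("chr2L", 100, "+"),
    ("chr2L", 100, "+"),
    ("chr2L", 100, "+"),
    ("chr2L", 250, "-"),
    ("chr3R", 40, "+"),
]

nrf = non_redundant_fraction(reads)
print(f"NRF = {nrf:.2f}")
```

A value close to 1 suggests a complex library; a low value suggests that many reads are copies of the same fragment.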



Alignment: e.g. BWA, Bowtie


Strand information for quality control

[Landt et al, 2012]
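The idea behind using strand information can be sketched as a strand cross-correlation: in a good ChIP, forward-strand reads pile up upstream of the binding site and reverse-strand reads downstream, so the two profiles correlate best when one is shifted by roughly the fragment length. The code below is a toy illustration under that assumption. The per-base count vectors, the 5 bp offset and the function names are invented for the example.

```python
def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def strand_cross_correlation(fwd, rev, max_shift):
    """Correlate forward counts at i with reverse counts at i + shift, for each shift."""
    scores = {}
    for shift in range(max_shift + 1):
        scores[shift] = pearson(fwd[: len(fwd) - shift], rev[shift:])
    return scores


# Toy per-base 5' end counts: forward reads pile up at two sites,
# reverse reads pile up 5 bp downstream of each site.
fwd = [0] * 60
rev = [0] * 60
for site in (10, 40):
    fwd[site] = 8
    rev[site + 5] = 8

scores = strand_cross_correlation(fwd, rev, max_shift=10)
best_shift = max(scores, key=scores.get)
print("best shift:", best_shift)
```

In real data the shift with the highest correlation is an estimate of fragment length, and the height of that peak relative to background is used as a quality measure.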


Basic idea: Count the number of reads in windows and determine whether this number is above background – if so, define that region as bound
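The basic idea can be sketched in a few lines: count reads per fixed window, estimate a background rate, and flag windows whose count is unlikely under that background. The sketch below assumes a uniform Poisson background estimated from the sample itself. The window size, the significance cutoff and the toy read positions are illustrative choices, and real peak callers are considerably more careful.

```python
import math


def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    term = math.exp(-lam)  # P(X = 0)
    cdf = 0.0
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)
    return max(0.0, 1.0 - cdf)


def count_in_windows(positions, genome_length, window):
    """Number of read positions falling in each non-overlapping window."""
    counts = [0] * math.ceil(genome_length / window)
    for pos in positions:
        counts[pos // window] += 1
    return counts


def call_bound_windows(positions, genome_length, window, alpha=1e-3):
    """Return (start, end, count) for windows with more reads than background predicts."""
    counts = count_in_windows(positions, genome_length, window)
    lam = len(positions) / len(counts)  # uniform background: mean reads per window
    return [
        (i * window, min((i + 1) * window, genome_length), count)
        for i, count in enumerate(counts)
        if poisson_upper_tail(count, lam) < alpha
    ]


# Toy data: one background read per 100 bp window plus a pile of 20 reads.
positions = list(range(50, 1000, 100)) + list(range(420, 440))
bound = call_bound_windows(positions, genome_length=1000, window=100)
print(bound)
```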


MACS 2.0

• Calculate peak shift from the 1000 best peaks
• Shift reads towards the 3' end
• Identify potentially bound regions
• Calculate enrichment and significance using a Poisson distribution with local λ

USeq

• Calculate peak shift
• Shift reads towards the 3' end
• Define windows
• Calculate enrichment per window, significance using negative binomial
• Join regions that are within max gap
• eFDR

SISSRs

• Estimate fragment length (mean sense-antisense distance)
• Windows with w/2 shift through genome
• Define potential peaks by transition in net tag count (n sense - n antisense)
• Calculate enrichment and significance using Poisson
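The "local λ" in the MACS column can be illustrated as follows: rather than one genome-wide background rate, the rate around a candidate region is taken as the maximum of the genome-wide rate and the rates seen in control reads in windows of several sizes around it. The sketch below is written in that spirit only. The window sizes, the function name and the toy control reads are assumptions, not the MACS implementation.

```python
def local_lambda(control_positions, center, genome_length, spans=(1000, 5000, 10000)):
    """Background rate (reads per bp) around `center`.

    Takes the maximum of the genome-wide rate and the rates in windows
    of the given sizes centred on the candidate site (sizes are assumed).
    """
    rates = [len(control_positions) / genome_length]  # genome-wide rate
    for span in spans:
        low = center - span // 2
        high = center + span // 2
        in_window = sum(low <= pos < high for pos in control_positions)
        rates.append(in_window / span)
    return max(rates)


# Toy control sample: one read every 1000 bp, plus a local pile of 20 reads
# around position 50000 (for example an open-chromatin artefact).
genome_length = 100_000
control = list(range(0, genome_length, 1000)) + list(range(49_900, 50_100, 10))

rate_at_artefact = local_lambda(control, 50_000, genome_length)
rate_elsewhere = local_lambda(control, 20_000, genome_length)
print(rate_at_artefact, rate_elsewhere)
```

A candidate peak at the artefact is tested against a much higher expected count than one elsewhere, which is what makes the local estimate more conservative than a single genome-wide λ.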


Downstream of ChIP

[Park et al, 2009]


References:

Landt et al. (2012) ChIP-seq guidelines and practices of the ENCODE and modENCODE Consortia. Genome Research 22: 1813-1831

Chen et al. (2012) Systematic evaluation of factors influencing ChIP-seq fidelity. Nat Methods 9: 609

Meyer & Liu (2014) Identifying and mitigating bias in next generation sequencing methods for chromatin biology. Nature Reviews Genetics doi:10.1038/nrg3788

